Abstract — In this paper, we consider the problem of reducing
network delay in stochastic network utility optimization problems.
We start by studying the recently proposed quadratic Lyapunov
function based algorithms (QLA). We show that for every stochastic
problem, there is a corresponding deterministic problem, whose dual
optimal solution "exponentially attracts" the network backlog
process under QLA. In particular, the probability that the backlog
vector under QLA deviates from the attractor is exponentially
decreasing in their Euclidean distance. This not only helps to
explain how QLA achieves the desired performance but also suggests
that one can roughly "subtract out" a Lagrange multiplier from the
system induced by QLA. We thus develop a family of Fast Quadratic
Lyapunov based Algorithms (FQLA) that achieve an
$[O(1/V), O(\log^2(V))]$ performance-delay tradeoff for problems
with a discrete set of action options, and achieve a square-root
tradeoff for continuous problems. This is similar to the optimal
performance-delay tradeoffs achieved in prior work by Neely (2007)
via drift-steering methods, and shows that QLA algorithms can also
be used to approach such performance.

These results highlight the "network gravity" role of Lagrange 
Multipliers in network scheduling. This role can be viewed as the 
counterpart of the "shadow price" role of Lagrange Multipliers 
in flow regulation for classic flow-based network problems. 

Index Terms — Queueing, Dynamic Control, Lyapunov analysis,
Stochastic Optimization



I. Introduction 

In this paper, we consider the problem of reducing network 
delay in the following general framework of the stochastic 
network utility optimization problem. We are given a time 
slotted stochastic network. The network state, such as the 
network channel condition, is time varying according to some 
probability law. A network controller performs some action 
based on the observed network state at every time slot. 
The chosen action incurs a cost (since cost minimization is 
mathematically equivalent to utility maximization, below we 
will use cost and utility interchangeably), but also serves some 
amount of traffic and possibly generates new traffic for the 
network. This traffic causes congestion, and thus leads to 
backlogs at nodes in the network. The goal of the controller 
is to minimize its time average cost subject to the constraint 
that the time average total backlog in the network is finite. 

This setting is very general, and many existing works fall 
into this category. Further, many techniques have been used 
to study this problem (see [1] for a survey). In this paper, we
focus on algorithms that are built upon quadratic Lyapunov
functions (called QLA in the following), e.g., [2], [3], [4],
[5], [6], [7]. These QLA algorithms are easy to implement,
greedy in nature, and are parameterized by a scalar control
variable $V$. It has been shown that when the network state is
i.i.d., QLA algorithms can achieve a time average utility that
is within $O(1/V)$ of the optimal. Therefore, as $V$ grows large,
the time average utility can be pushed arbitrarily close to the
optimal. However, such close-to-optimal utility is usually at
the expense of large network delay. In fact, in [3], [4], [7],
it is shown that an $O(V)$ network delay is incurred when
an $O(1/V)$ close-to-optimal utility is achieved. Two recent
papers [8] and [9], which show that it is possible to achieve
within $O(1/V)$ of optimal utility with only $O(\log^2(V))$ delay,
use a more sophisticated algorithm design approach based
on exponential Lyapunov functions. Therefore, it seems that,
though simple to implement, QLA algorithms have
undesired delay performance.

However, we note that the delay results of QLA are usually
given in terms of long term upper bounds of the average
network backlog, e.g., [7]. Thus they do not examine the
possibility that the actual backlog vector (or its time average)
converges to some fixed value. Work in [10] considers drift
properties towards an "invariant" backlog vector, derived in
the special case when the problem exhibits a unique optimal
Lagrange multiplier. An upper bound on the long term deviation
of the actual backlog from the Lagrange multiplier vector
is obtained. While this suggests Lagrange multipliers are
"gravitational attractors," the bounds in [10] do not show that
the actual backlog is very unlikely to deviate significantly
from the attractor.

In this paper, we focus on obtaining stronger probability
results of the steady state backlog process behavior under
QLA. We first show that under QLA, even though the backlog
can grow linearly in $V$, it "typically" stays close to an
"attractor," which is the dual optimal solution of a deterministic
optimization problem. In particular, the probability that the
backlog vector deviates from the attractor is exponentially
decreasing in distance, which significantly tightens the
attractor analysis in [10]. This implies that a large amount of
the data is kept in the network simply for maintaining the
backlog at the "right" level. Therefore, even if we replace
these data with some fake data (denoted as place-holder bits
[11]), the performance of QLA will not be heavily affected.
Based on this finding, we propose a family of Fast Quadratic
Lyapunov based Algorithms (FQLA), which, intuitively speaking,
can be viewed as subtracting out a Lagrange multiplier
from the system induced by QLA. We show that when the
network state is i.i.d., FQLA is able to achieve within $O(1/V)$
of optimal utility with an $O(\log^2(V))$ delay guarantee for
problems with a discrete set of action options, and achieve
an $[O(1/V), O(\sqrt{V}\log^2(V))]$ tradeoff for problems with a
set of continuous action options. The development of FQLA
also provides us with additional insights into QLA algorithms
and the role of Lagrange multipliers in stochastic network
optimization.

The performance of FQLA is closely related to the TOCA
algorithm in [8], which obtains the same logarithmic and
square-root tradeoffs for the energy-delay problem (up to a
$\log(V)$ difference) via drift steering techniques. However, we
note that FQLA differs from TOCA in the following: First,
TOCA in [8] is constructed based on exponential Lyapunov
functions, while FQLA uses simpler quadratic Lyapunov functions.
Second, FQLA is designed to mimic QLA, and thus can
be viewed as trying to maintain the dual variable property
under QLA, whereas TOCA is designed to ensure the primal
constraints are satisfied. Third, FQLA requires an arbitrarily
small but nonzero fraction of packet droppings, hence cannot
be applied to problems where packet dropping is not allowed.

We now summarize the main contributions of this paper in 
the following: 

• This paper proves that in steady state, the backlog process
under QLA is "exponentially attracted" to an attractor.
This fact also helps to explain how QLA achieves the
desired performance.

• This paper proposes a family of Fast Quadratic Lyapunov
based Algorithms (FQLA), which are usually easy to
implement, and can achieve an $[O(1/V), O(\log^2(V))]$
performance-delay tradeoff for general stochastic optimization
problems with a discrete set of action options, as
well as a square-root tradeoff for continuous problems.

• This paper highlights a new functionality of Lagrange 
multipliers: the "network gravity" in network scheduling. 

The paper is organized as follows: In Section II, we set up
our notations. In Section III, we state our network model. We
then review the QLA algorithm and define the deterministic
problem in Section IV. In Section V, we show that the
backlog process under QLA always stays close to an attractor.
In Section VI, we propose the FQLA algorithm. Section
VII considers single queue network problems and provides
both deterministic and probabilistic bounds on the backlog
size. Section VIII provides simulation results. We discuss the
"gravity" role of Lagrange multipliers and relate QLA to the
randomized incremental subgradient method (RISM) [12] in
Section IX.

II. Notations 

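• $\mathbb{R}$: the set of real numbers

• $\mathbb{R}_+$ (or $\mathbb{R}_-$): the set of nonnegative (or non-positive) real
numbers

• $\mathbb{R}^n$ (or $\mathbb{R}^n_+$): the set of $n$ dimensional column vectors,
with each element being in $\mathbb{R}$ (or $\mathbb{R}_+$)

• bold symbols $\mathbf{x}$ and $\mathbf{x}^T$: column vector and its transpose

• $\mathbf{x} \succeq \mathbf{y}$: vector $\mathbf{x}$ is entrywise no less than vector $\mathbf{y}$

• $\mathbf{0}$: column vector with all elements being 0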



III. System Model 

In this section, we specify the general network model we 
use. We consider a network controller that operates a network 
with the goal of minimizing the time average cost, subject 
to the queue stability constraint. The network is assumed to
operate in slotted time, i.e., $t \in \{0, 1, 2, \ldots\}$. We assume there
are $r \ge 1$ queues in the network.

A. Network State 

We assume there are a total of $M$ different random network
states, and define $\mathcal{S} = \{s_1, s_2, \ldots, s_M\}$ as the set of possible
states. Each particular state $s_i$ indicates the current network
parameters, such as a vector of channel conditions for each
link, or a collection of other relevant information about the
current network channels and arrivals. Let $S(t)$ denote the
network state at time $t$. We assume that $S(t)$ is i.i.d. every
time slot, and let $p_{s_i}$ denote its probability of being in state $s_i$,
i.e., $p_{s_i} = Pr\{S(t) = s_i\}$. We assume the network controller
can observe $S(t)$ at the beginning of every slot $t$, but the $p_{s_i}$
probabilities are not necessarily known.

B. The Cost, Traffic and Service 

At each time $t$, after observing $S(t) = s_i$, the controller
chooses an action $x(t)$ from a set $\mathcal{X}^{(s_i)}$, i.e., $x(t) = x^{(s_i)}$ for
some $x^{(s_i)} \in \mathcal{X}^{(s_i)}$. The set $\mathcal{X}^{(s_i)}$ is called the feasible action
set for network state $s_i$ and is assumed to be time-invariant and
compact for all $s_i \in \mathcal{S}$. The cost, traffic and service generated
by the chosen action $x(t) = x^{(s_i)}$ are as follows:

(a) The chosen action has an associated cost given by
the cost function $f(t) = f(s_i, x^{(s_i)}) : \mathcal{X}^{(s_i)} \mapsto \mathbb{R}_+$
(or $\mathcal{X}^{(s_i)} \mapsto \mathbb{R}_-$ in the case of reward maximization
problems);

(b) The amount of traffic generated by the action to
queue $j$ is determined by the traffic function $A_j(t) = g_j(s_i, x^{(s_i)}) : \mathcal{X}^{(s_i)} \mapsto \mathbb{R}_+$, in units of packets;

(c) The amount of service allocated to queue $j$ is given by
the rate function $\mu_j(t) = b_j(s_i, x^{(s_i)}) : \mathcal{X}^{(s_i)} \mapsto \mathbb{R}_+$, in
units of packets.

Note that $A_j(t)$ includes both the exogenous arrivals from
outside the network to queue $j$, and the endogenous arrivals from
other queues, i.e., the transmitted packets from other queues, to
queue $j$ (see Sections III-C and III-D for further explanations).
We assume the functions $f(s_i, \cdot)$, $g_j(s_i, \cdot)$ and $b_j(s_i, \cdot)$ are
time-invariant, their magnitudes are uniformly upper bounded
by some constant $\delta_{max} \in (0, \infty)$ for all $s_i$, $j$, and they are
known to the network operator. We also assume that there
exists a set of actions $\{x_k^{(s_i)}\}_{i=1,\ldots,M}^{k=1,2,\ldots}$ with $x_k^{(s_i)} \in \mathcal{X}^{(s_i)}$
such that
$$\sum_{s_i} p_{s_i} \Big\{ \sum_k \vartheta_k^{(s_i)} \big[ g_j(s_i, x_k^{(s_i)}) - b_j(s_i, x_k^{(s_i)}) \big] \Big\} \le -\epsilon$$
for some $\epsilon > 0$ for all $j$, with $\sum_k \vartheta_k^{(s_i)} = 1$ and
$\vartheta_k^{(s_i)} \ge 0$ for all $s_i$ and $k$. That is, the constraints are feasible
with $\epsilon$ slackness. Thus, there exists a stationary randomized
policy that stabilizes all queues (where $\vartheta_k^{(s_i)}$ represents the
probability of choosing action $x_k^{(s_i)}$ when $S(t) = s_i$). In the
following, we use:

$$\mathbf{A}(t) = (A_1(t), A_2(t), \ldots, A_r(t))^T, \tag{1}$$
$$\boldsymbol{\mu}(t) = (\mu_1(t), \mu_2(t), \ldots, \mu_r(t))^T, \tag{2}$$

to denote the arrival and service vectors at time $t$. It is easy
to see from above that if we define:

$$B = \sqrt{r}\,\delta_{max}, \tag{3}$$

then $\|\mathbf{A}(t) - \boldsymbol{\mu}(t)\| \le B$ for all $t$.

C. Queueing, Average Cost and the Stochastic Problem

Let $\mathbf{U}(t) = (U_1(t), \ldots, U_r(t))^T$, $t = 0, 1, 2, \ldots$ be
the queue backlog vector process of the network, in units of
packets. We assume the following queueing dynamics:

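$$U_j(t+1) = \max\big[U_j(t) - \mu_j(t), 0\big] + A_j(t) \quad \forall j, \tag{4}$$

and $\mathbf{U}(0) = \mathbf{0}$. Note that by using (4), we assume that when
a queue does not have enough packets to send, null packets
are transmitted. In this paper, we adopt the following notion
of queue stability:

$$\mathbb{E}\Big\{\sum_{j=1}^{r} U_j\Big\} \triangleq \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j=1}^{r} \mathbb{E}\{U_j(\tau)\} < \infty. \tag{5}$$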



We also use $f^{\pi}_{av}$ to denote the time average cost induced by
an action-seeking policy $\pi$, defined as:

$$f^{\pi}_{av} \triangleq \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{f^{\pi}(\tau)\}, \tag{6}$$

where $f^{\pi}(\tau)$ is the cost incurred at time $\tau$ by policy $\pi$. We call
an action-seeking policy under which (5) holds a stable policy,
and use $f^*_{av}$ to denote the optimal time average cost over
all stable policies. Every slot, the network controller observes
the current network state and chooses a control action, with
the goal of minimizing time average cost subject to network
stability. This goal can be mathematically stated as:

$$\min: \; f^{\pi}_{av}, \quad \text{s.t. (5)}.$$

In the rest of the paper, we will refer to this problem as 
the stochastic problem. This stochastic problem framework 
can be used to model many network utility problems, such 
as the energy minimization problem [3] and the access point 
pricing problem [5]. We note that a similar network model 
with stochastic penalties is treated in [13] using a fluid model 
and a primal-dual approach that achieves optimality in a 
limiting sense. The framework is also treated in [7] using a 
quadratic Lyapunov based algorithm (QLA) that provides an 
explicit $[O(1/V), O(V)]$ performance-delay tradeoff when the
network state is i.i.d.

D. An Example of the Model 

Here we provide an example to illustrate our model. Consider
the 2-queue network in Fig. 1. Every slot, the network
operator makes a decision on whether or not to allocate one
unit of power to serve packets at each queue, so as to support all
arriving traffic, i.e., maintain queue stability, with minimum
energy expenditure. Every slot, the number of arriving packets
$R(t)$ is i.i.d., being either 2 or 0 with probabilities 5/8 and
3/8 respectively. The channel states $S_1(t)$, $S_2(t)$ are also i.i.d.,
being either "G=good" or "B=bad" with equal probabilities.
One unit of power can serve 2 packets in a good channel but
can only serve one in a bad channel. Both channels can be
activated simultaneously without affecting each other.

Fig. 1. A 2-queue system. (Figure labels: $A_1(t) = R(t)$, $\mu_1(t) = A_2(t)$, $S_1(t)$, $\mu_2(t)$, $S_2(t)$.)

In this case, a network state $S(t)$ is a $(R(t), S_1(t), S_2(t))$
tuple and $S(t)$ is i.i.d. There are eight possible network states.
At each state $s_i$, the action $x^{(s_i)}$ is a pair $(x_1, x_2)$, with $x_i$
being the amount of energy spent at queue $i$, and $(x_1, x_2) \in \mathcal{X}^{(s_i)} = \{0/1, 0/1\}$. The cost function is always $f(s_i, x^{(s_i)}) = x_1 + x_2$ for all $s_i$. The network states, the traffic functions and
service rate functions are summarized in Fig. 2. Note here
$A_1(t) = R(t)$ is part of $S(t)$ and thus is independent of $x^{(s_i)}$,
while $A_2(t) = \mu_1(t)$ hence depends on $x^{(s_i)}$. Also note that
$A_2(t)$ equals $\mu_1(t)$ instead of $\min[\mu_1(t), U_1(t)]$ due to our
idle fill assumption in Section III-C.

Fig. 2. Network state, traffic and rate functions.
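The example model can be written down directly. The following Python sketch encodes the eight network states with their probabilities and the cost, traffic and rate functions $f$, $g_j$, $b_j$ of the 2-queue example. The function and variable names (`network_states`, `cost`, `traffic`, `service`, `ACTIONS`, `RATE`) are illustrative labels introduced here, not notation from the paper.

```python
from itertools import product

RATE = {"G": 2, "B": 1}          # packets served per unit of power

def network_states():
    """Map each state (R, S1, S2) to its probability p_s."""
    p_arrival = {2: 5 / 8, 0: 3 / 8}
    p_channel = {"G": 1 / 2, "B": 1 / 2}
    return {(r, s1, s2): p_arrival[r] * p_channel[s1] * p_channel[s2]
            for r, s1, s2 in product(p_arrival, p_channel, p_channel)}

ACTIONS = list(product((0, 1), repeat=2))   # feasible (x1, x2) in every state

def cost(state, x):
    """f(s, x) = x1 + x2, the total power spent."""
    return x[0] + x[1]

def service(state, x):
    """(b_1, b_2): packets each queue can serve under action x."""
    _, s1, s2 = state
    return (x[0] * RATE[s1], x[1] * RATE[s2])

def traffic(state, x):
    """(g_1, g_2): A_1 = R is exogenous, A_2 = mu_1 (idle fill)."""
    return (state[0], service(state, x)[0])
```

For instance, in state `(2, "G", "B")` the action `(1, 0)` costs one unit of power, serves two packets at queue 1 and injects those two packets into queue 2.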

IV. QLA and the Deterministic Problem

In this section, we first review the quadratic Lyapunov func- 
tions based algorithms (the QLA algorithm) [7] for solving the 
stochastic problem. Then we define the deterministic problem 
and its dual. We then describe the ordinary subgradient method 
(OSM) that can be used to solve the dual. The dual problem 
and OSM will also be used later for our analysis of the steady 
state backlog behavior under QLA. 

A. The QLA algorithm 

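To solve the stochastic problem using QLA, we first define
a quadratic Lyapunov function $L(\mathbf{U}(t)) = \frac{1}{2}\sum_{j=1}^{r} U_j^2(t)$.
We then define the one-slot conditional Lyapunov drift:
$\Delta(\mathbf{U}(t)) = \mathbb{E}\{L(\mathbf{U}(t+1)) - L(\mathbf{U}(t)) \mid \mathbf{U}(t)\}$. From (4),
we obtain the following drift expression:

$$\Delta(\mathbf{U}(t)) \le C - \mathbb{E}\Big\{\sum_{j=1}^{r} U_j(t)\big[\mu_j(t) - A_j(t)\big] \,\Big|\, \mathbf{U}(t)\Big\},$$

where $C = r\delta_{max}^2$. Now add to both sides the term
$V\mathbb{E}\{f(t) \mid \mathbf{U}(t)\}$, where $V \ge 1$ is a scalar control variable.
We obtain:

$$\Delta(\mathbf{U}(t)) + V\mathbb{E}\{f(t) \mid \mathbf{U}(t)\} \le C - \mathbb{E}\Big\{-Vf(t) + \sum_{j=1}^{r} U_j(t)\big[\mu_j(t) - A_j(t)\big] \,\Big|\, \mathbf{U}(t)\Big\}. \tag{7}$$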

The QLA algorithm is then obtained by choosing an action
$x$ at every time slot $t$ to minimize the right hand side of (7)
given $\mathbf{U}(t)$. Specifically, the QLA algorithm works as follows:

QLA: At every time slot $t$, observe the current network state
$S(t)$ and the backlog $\mathbf{U}(t)$. If $S(t) = s_i$, choose $x^{(s_i)} \in \mathcal{X}^{(s_i)}$
that solves the following:

$$\max \; -Vf(s_i, x) + \sum_{j=1}^{r} U_j(t)\big[b_j(s_i, x) - g_j(s_i, x)\big] \tag{8}$$
$$\text{s.t.} \quad x \in \mathcal{X}^{(s_i)}.$$

Depending on the problem structure, (8) can usually be
decomposed into separate parts that are easier to solve, e.g.,
[3], [5]. Also, it can be shown, as in [7], that

$$f^{QLA}_{av} = f^*_{av} + O(1/V), \quad \overline{U}^{QLA} = O(V), \tag{9}$$

where $f^{QLA}_{av}$ is the average cost under QLA and $\overline{U}^{QLA}$ is the
time average network backlog size under QLA.
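To make the QLA rule concrete, the following is a minimal Python sketch that runs it on the 2-queue example of Section III-D: each slot it draws a network state, solves (8) by enumerating the four actions, and applies the queue dynamics (4). The helper names (`qla_action`, `simulate_qla`) and the choices of $V$, horizon and seed are assumptions made for illustration only.

```python
import random
from itertools import product

RATE = {"G": 2, "B": 1}                      # packets per unit of power
ACTIONS = list(product((0, 1), repeat=2))    # (x1, x2)

def f(s, x):                                 # cost: total power
    return x[0] + x[1]

def b(s, x):                                 # service (mu_1, mu_2)
    return (x[0] * RATE[s[1]], x[1] * RATE[s[2]])

def g(s, x):                                 # arrivals (A_1, A_2) = (R, mu_1)
    return (s[0], b(s, x)[0])

def qla_action(s, U, V):
    """Solve (8) by enumeration over the finite action set."""
    def objective(x):
        return -V * f(s, x) + sum(
            u * (bj - gj) for u, bj, gj in zip(U, b(s, x), g(s, x)))
    return max(ACTIONS, key=objective)

def simulate_qla(V, slots, seed=0):
    """Return (time average cost, time average total backlog)."""
    rng = random.Random(seed)
    U = [0, 0]
    total_cost = total_backlog = 0
    for _ in range(slots):
        s = (2 if rng.random() < 5 / 8 else 0,
             "G" if rng.random() < 0.5 else "B",
             "G" if rng.random() < 0.5 else "B")
        x = qla_action(s, U, V)
        total_cost += f(s, x)
        total_backlog += sum(U)
        mu, A = b(s, x), g(s, x)
        # Queueing dynamics (4), with null packets when a queue runs short.
        U = [max(u - m, 0) + a for u, m, a in zip(U, mu, A)]
    return total_cost / slots, total_backlog / slots
```

Running `simulate_qla(V, slots)` for increasing `V` is one way to observe the tradeoff in (9): the average cost settles while the average backlog grows roughly linearly in `V`.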



B. The Deterministic Problem 

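Consider the deterministic problem as follows:

$$\min \; \mathcal{F}(\mathbf{x}) \triangleq V\sum_{s_i} p_{s_i} f(s_i, x^{(s_i)}) \tag{10}$$
$$\text{s.t.} \quad \mathcal{G}_j(\mathbf{x}) \triangleq \sum_{s_i} p_{s_i} g_j(s_i, x^{(s_i)}) \le \mathcal{B}_j(\mathbf{x}) \triangleq \sum_{s_i} p_{s_i} b_j(s_i, x^{(s_i)}) \quad \forall j,$$
$$x^{(s_i)} \in \mathcal{X}^{(s_i)} \quad \forall i = 1, 2, \ldots, M,$$

where $p_{s_i}$ corresponds to the probability of $S(t) = s_i$ and
$\mathbf{x} = (x^{(s_1)}, \ldots, x^{(s_M)})^T$. The dual problem of (10) can be
obtained as follows:

$$\max \; q(\mathbf{U}) \quad \text{s.t.} \quad \mathbf{U} \succeq \mathbf{0}, \tag{11}$$

where $q(\mathbf{U})$ is called the dual function and is defined as:

$$q(\mathbf{U}) = \inf_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \Big\{ V\sum_{s_i} p_{s_i} f(s_i, x^{(s_i)}) + \sum_{j} U_j \Big[ \sum_{s_i} p_{s_i} g_j(s_i, x^{(s_i)}) - \sum_{s_i} p_{s_i} b_j(s_i, x^{(s_i)}) \Big] \Big\}. \tag{12}$$

By rearranging the terms, we note that $q(\mathbf{U})$ can also be
written in the following separable form, which is more useful
for our later analysis.

$$q(\mathbf{U}) = \inf_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \sum_{s_i} p_{s_i} \Big\{ Vf(s_i, x^{(s_i)}) + \sum_{j} U_j \big[ g_j(s_i, x^{(s_i)}) - b_j(s_i, x^{(s_i)}) \big] \Big\}. \tag{13}$$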

Here $\mathbf{U} = (U_1, \ldots, U_r)^T$ is the Lagrange multiplier of
(10). It is well known that $q(\mathbf{U})$ in (12) is concave in the
vector $\mathbf{U}$, and hence the problem (11) can usually be solved
efficiently, particularly when cost functions and rate functions
are separable over different network components. It
is also well known that in many situations, the optimal value
of (11) is the same as the optimal value of (10), and in this
case we say that there is no duality gap [12].



We note that the deterministic problem (10) is not necessarily
convex, as the sets $\mathcal{X}^{(s_i)}$ are not necessarily convex, and
the functions $f(s_i, \cdot)$, $g_j(s_i, \cdot)$ and $b_j(s_i, \cdot)$ are not necessarily
convex. Therefore, there may be a duality gap between the
deterministic problem (10) and its dual (11). Furthermore,
solving the deterministic problem (10) may not solve the
stochastic problem. This is so since at every network state,
the stochastic problem may require time sharing over more
than one action, but the solution to the deterministic problem
gives only a fixed operating point per network state. However,
one can show, by using an argument similar to showing the
existence of an optimal stationary randomized algorithm in
[5], that the dual problem (11) gives the exact value of $Vf^*_{av}$,
where $f^*_{av}$ is the optimal time average cost, even if (10) is
non-convex.

Among the many algorithms that can be used to solve (11),
the following algorithm is the most common one (for
performance see [12]). We denote it as the ordinary subgradient
method (OSM):

OSM: Initialize $\mathbf{U}(0)$; at every iteration $t$, observe $\mathbf{U}(t)$,

1) Find $x_U^{(s_i)} \in \mathcal{X}^{(s_i)}$ for $i \in \{1, \ldots, M\}$ that achieves the
infimum of the right hand side of (12).

2) Using the $\mathbf{x}_U = (x_U^{(s_1)}, x_U^{(s_2)}, \ldots, x_U^{(s_M)})^T$ found, update:

$$U_j(t+1) = \max\Big[ U_j(t) - \alpha^t \sum_{s_i} p_{s_i} \big[ b_j(s_i, x_U^{(s_i)}) - g_j(s_i, x_U^{(s_i)}) \big], \; 0 \Big]. \tag{14}$$

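We use $x_U^{(s_i)}$ to highlight its dependency on $\mathbf{U}(t)$. The term
$\alpha^t > 0$ is called the step size at iteration $t$. In the following, we
will always assume $\alpha^t = 1$ when referring to OSM. Note that
if there is only one network state, QLA and OSM will choose
the same action given the same $\mathbf{U}$, and they differ only by (4)
and (14). The term $\mathbf{G}_U = (G_{U,1}, G_{U,2}, \ldots, G_{U,r})^T$, with:

$$G_{U,j} = \mathcal{G}_j(\mathbf{x}_U) - \mathcal{B}_j(\mathbf{x}_U) = \sum_{s_i} p_{s_i} \big[ g_j(s_i, x_U^{(s_i)}) - b_j(s_i, x_U^{(s_i)}) \big], \tag{15}$$

is called the subgradient of $q(\mathbf{U})$ at $\mathbf{U}(t)$. It is well known
that for any other $\hat{\mathbf{U}} \in \mathbb{R}^r$, we have:

$$(\hat{\mathbf{U}} - \mathbf{U}(t))^T \mathbf{G}_U \ge q(\hat{\mathbf{U}}) - q(\mathbf{U}(t)). \tag{16}$$

Using $\|\mathbf{G}_U\| \le B$, we note that (16) also implies:

$$q(\hat{\mathbf{U}}) - q(\mathbf{U}(t)) \le B\|\hat{\mathbf{U}} - \mathbf{U}(t)\| \quad \forall\, \hat{\mathbf{U}}, \mathbf{U}(t) \in \mathbb{R}^r. \tag{17}$$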
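As an illustration of the OSM iteration (14) with unit step size, the following Python sketch applies it to the 2-queue example of Section III-D, computing the subgradient (15) by enumerating the four actions in each of the eight states. The names `lagrangian`, `subgradient` and `osm` are hypothetical, and the attractor $(2V, V)$ used in the usage note is our own hand calculation for this example, not a value stated in the paper.

```python
from itertools import product
from math import hypot

RATE = {"G": 2, "B": 1}
ACTIONS = list(product((0, 1), repeat=2))
# p_s for the eight states (R, S1, S2): Pr{R=2} = 5/8, channels equiprobable.
STATES = {(r, s1, s2): pr * 0.25
          for (r, pr), s1, s2 in product(((2, 5 / 8), (0, 3 / 8)), "GB", "GB")}

def lagrangian(state, x, U, V):
    """V f + sum_j U_j (g_j - b_j) for one state and one action."""
    r, s1, s2 = state
    b1, b2 = x[0] * RATE[s1], x[1] * RATE[s2]
    return V * (x[0] + x[1]) + U[0] * (r - b1) + U[1] * (b1 - b2)

def subgradient(U, V):
    """G_U of (15): expected g_j - b_j under the per-state minimizers."""
    G = [0.0, 0.0]
    for state, p in STATES.items():
        x = min(ACTIONS, key=lambda a: lagrangian(state, a, U, V))
        r, s1, s2 = state
        b1, b2 = x[0] * RATE[s1], x[1] * RATE[s2]
        G[0] += p * (r - b1)
        G[1] += p * (b1 - b2)
    return G

def osm(V, iterations, U0=(0.0, 0.0)):
    """Iterate (14) with step size 1 and return the final multiplier."""
    U = list(U0)
    for _ in range(iterations):
        G = subgradient(U, V)
        U = [max(u + gj, 0.0) for u, gj in zip(U, G)]
    return U
```

With `V = 20`, the iterates started from zero settle into a small neighbourhood of `(40, 20)`, which can be checked with `hypot(U[0] - 40, U[1] - 20)`.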



We are now ready to study the steady state behavior of $\mathbf{U}(t)$
under QLA. To simplify notations and highlight the scaling
effect of the scalar $V$ in QLA, we use the following notations:

1) We use $q_0(\mathbf{U})$ and $\mathbf{U}_0^*$ to denote the dual objective
function and an optimal solution of (11) when $V = 1$;
and use $q(\mathbf{U})$ and $\mathbf{U}_V^*$ (also called the optimal Lagrange
multiplier) for their counterparts with general $V \ge 1$;

2) We use $x_U^{(s_i)}$ to denote an action chosen by QLA
for a given $\mathbf{U}(t)$ and $S(t) = s_i$, and use $\mathbf{x}_U = (x_U^{(s_1)}, \ldots, x_U^{(s_M)})^T$ to denote a solution chosen by OSM
for a given $\mathbf{U}(t)$.

To simplify analysis, we assume the following throughout:

Assumption 1: $\mathbf{U}_V^* = (U_{V1}^*, \ldots, U_{Vr}^*)^T$ is unique for all
$V \ge 1$.

Note that Assumption 1 is not very restrictive. In fact, it
holds in many network utility optimization problems, e.g.,
[10]. In many cases, we also have $\mathbf{U}_V^* \neq \mathbf{0}$. Moreover, for the
assumption to hold for all $V \ge 1$, it suffices to have just $\mathbf{U}_0^*$
being unique. This is shown in the following lemma regarding
the scaling effect of the parameter $V$ on the optimal Lagrange
multiplier.

Lemma 1: $\mathbf{U}_V^* = V\mathbf{U}_0^*$.

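Proof: From (13) we see that:

$$q(\mathbf{U})/V = \inf_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \sum_{s_i} p_{s_i} \Big\{ f(s_i, x^{(s_i)}) + \sum_{j} \hat{U}_j \big[ g_j(s_i, x^{(s_i)}) - b_j(s_i, x^{(s_i)}) \big] \Big\},$$

where $\hat{U}_j = U_j / V$. However, the right hand side is exactly
$q_0(\hat{\mathbf{U}})$, and thus is maximized at $\hat{\mathbf{U}} = \mathbf{U}_0^*$. Hence $q(\mathbf{U})$ is
maximized at $V\mathbf{U}_0^*$. ■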
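The separable form (13) and the scaling in Lemma 1 can be checked numerically on the 2-queue example of Section III-D. The sketch below evaluates $q(\mathbf{U})$ with one inner minimization per network state and locates its maximizer on a grid. The function names `q` and `argmax_on_grid` and the grid parameters are assumptions for illustration, and a grid search is only a crude stand-in for actually solving (11).

```python
from itertools import product

RATE = {"G": 2, "B": 1}
ACTIONS = list(product((0, 1), repeat=2))
# p_s for the eight states (R, S1, S2) of the example.
STATES = {(r, s1, s2): pr * 0.25
          for (r, pr), s1, s2 in product(((2, 5 / 8), (0, 3 / 8)), "GB", "GB")}

def q(U, V):
    """Dual function in the separable form (13)."""
    total = 0.0
    for (r, s1, s2), p in STATES.items():
        total += p * min(
            V * (x1 + x2)
            + U[0] * (r - x1 * RATE[s1])                  # g_1 - b_1
            + U[1] * (x1 * RATE[s1] - x2 * RATE[s2])      # g_2 - b_2
            for x1, x2 in ACTIONS)
    return total

def argmax_on_grid(V, step, upper):
    """Maximizer of q(., V) over the grid {0, step, ..., upper}^2."""
    n = int(upper / step) + 1
    grid = [k * step for k in range(n)]
    return max(product(grid, grid), key=lambda U: q(U, V))
```

On this example the grid maximizer for $V = 1$ is $(2, 1)$ and for $V = 4$ it is $(8, 4)$, in line with $\mathbf{U}_V^* = V\mathbf{U}_0^*$, and the identity $q(\mathbf{U}) = Vq_0(\mathbf{U}/V)$ used in the proof holds at these points.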

V. Backlog vector behavior under QLA 

In this section we study the backlog vector behavior under
QLA of the stochastic problem. We first look at the case
when $q_0(\mathbf{U})$ is "locally polyhedral." We show that $\mathbf{U}(t)$ is
mostly within $O(\log(V))$ distance from $\mathbf{U}_V^*$ in this case,
even when $S(t)$ evolves according to a more general time
homogeneous Markovian process. We then consider the case
when $q_0(\mathbf{U})$ is "locally smooth", and show that $\mathbf{U}(t)$ is mostly
within $O(\sqrt{V}\log(V))$ distance from $\mathbf{U}_V^*$. As we will see,
these two results also explain how QLA functions.

A. When $q_0(\cdot)$ is "locally polyhedral"

In this section, we study the backlog vector behavior under
QLA for the case where $q_0(\mathbf{U})$ is locally polyhedral with
parameters $\epsilon, L$, i.e., there exist $\epsilon, L > 0$ such that for all
$\mathbf{U} \succeq \mathbf{0}$ with $\|\mathbf{U} - \mathbf{U}_0^*\| < \epsilon$, the dual function $q_0(\mathbf{U})$ satisfies:

$$q_0(\mathbf{U}_0^*) \ge q_0(\mathbf{U}) + L\|\mathbf{U}_0^* - \mathbf{U}\|. \tag{18}$$

We will show that in this case, even if $S(t)$ is a general
time homogeneous Markovian process, the backlog vector will
mostly be within $O(\log(V))$ distance to $\mathbf{U}_V^*$. Hence the same
is also true when $S(t)$ is i.i.d.

To start, we assume for this subsection that $S(t)$ evolves
according to a time homogeneous Markovian process. Now
we define the following notations. Given $t_0$, define $\mathcal{T}_{s_i}(t_0, k)$
to be the set of slots $\tau \in [t_0, t_0 + k - 1]$ at which $S(\tau) = s_i$.
For a given $\nu > 0$, define the convergent interval $T_\nu$ [14] for
the $S(t)$ process to be the smallest number of slots such that
for any $t_0$, regardless of past history, we have:

$$\sum_{i=1}^{M} \left| p_{s_i} - \frac{\mathbb{E}\{\|\mathcal{T}_{s_i}(t_0, T_\nu)\| \mid \mathcal{H}(t_0)\}}{T_\nu} \right| \le \nu, \tag{19}$$

here $\|\mathcal{T}_{s_i}(t_0, T_\nu)\|$ is the cardinality of $\mathcal{T}_{s_i}(t_0, T_\nu)$, and
$\mathcal{H}(t_0) = \{S(\tau)\}_{\tau=0}^{t_0-1}$ denotes the network state history up
to time $t_0$. For any $\nu > 0$, such a $T_\nu$ must exist for any
stationary ergodic process with finite state space, thus $T_\nu$
exists for $S(t)$ in particular. When $S(t)$ is i.i.d. every slot, we
have $T_\nu = 1$ for all $\nu \ge 0$, as $\mathbb{E}\{\|\mathcal{T}_{s_i}(t_0, 1)\| \mid \mathcal{H}(t_0)\} = p_{s_i}$.
Intuitively, $T_\nu$ represents the time needed for the process to
reach its "near" steady state.
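For a finite-state Markov chain, the convergent interval can be computed directly, since the past history only matters through the most recent state. The following Python sketch assumes the condition in (19) reads "total deviation at most $\nu$", takes a transition matrix `P` and a stationary distribution `pi` as inputs, and returns the smallest qualifying number of slots. The function name `convergent_interval` and the `max_T` cutoff are illustrative choices.

```python
def convergent_interval(P, pi, nu, max_T=10_000):
    """Smallest T such that, for every possible last state s,
    sum_i |pi[i] - E[# slots in state i among the next T | s] / T| <= nu.
    Returns None if no such T <= max_T is found."""
    M = len(pi)
    # rows[s][i] = Pr{state i, k slots after last state s}; starts at k = 1.
    rows = [list(r) for r in P]
    # visits[s][i] = expected number of visits to i in the first T slots.
    visits = [[0.0] * M for _ in range(M)]
    for T in range(1, max_T + 1):
        for s in range(M):
            for i in range(M):
                visits[s][i] += rows[s][i]
        if all(sum(abs(pi[i] - visits[s][i] / T) for i in range(M)) <= nu
               for s in range(M)):
            return T
        # Advance one slot: rows <- rows * P.
        rows = [[sum(r[k] * P[k][i] for k in range(M)) for i in range(M)]
                for r in rows]
    return None
```

For an i.i.d. process every row of `P` equals `pi`, so the function returns 1 even for `nu = 0`, matching the remark above. For a "sticky" two-state chain with self-transition probability 0.9 and `nu = 0.1`, it returns 40.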

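The following theorem summarizes the main results. Recall
that $B$ is defined in (3) as the upper bound of the magnitude
change of $\mathbf{U}(t)$ in a slot.

Theorem 1: If $q_0(\mathbf{U})$ is locally polyhedral with constants
$\epsilon, L > 0$, independent of $V$, then under QLA,

(a) There exist constants $\nu > 0$, $D \ge \eta > 0$, all independent
of $V$, such that $D = D(\nu)$, $\eta = \eta(\nu)$, and whenever
$\|\mathbf{U}(t) - \mathbf{U}_V^*\| \ge D$, we have:

$$\mathbb{E}\{\|\mathbf{U}(t + T_\nu) - \mathbf{U}_V^*\| \mid \mathbf{U}(t)\} \le \|\mathbf{U}(t) - \mathbf{U}_V^*\| - \eta. \tag{20}$$

In particular, the constants $\nu$, $D$ and $\eta$ that satisfy (20)
can be chosen as follows: Choose $\nu$ as any constant such
that $0 < \nu < L/B$. Then choose $\eta$ as any value such
that $0 < \eta < T_\nu(L - B\nu)$. Finally, choose $D$ as:¹

$$D = \max\left[ \frac{(T_\nu^2 + T_\nu)B^2 - \eta^2}{2T_\nu(L - \frac{\eta}{T_\nu} - B\nu)}, \; \eta \right]. \tag{21}$$

(b) For given constants $\nu, D, \eta$ in (a), there exist some
constants $c^*, \beta^* > 0$, independent of $V$, such that:

$$\mathcal{P}(D, m) \le c^* e^{-\beta^* m}, \tag{22}$$

where $\mathcal{P}(D, m)$ is defined as:

$$\mathcal{P}(D, m) \triangleq \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} Pr\{\|\mathbf{U}(\tau) - \mathbf{U}_V^*\| > D + m\}. \tag{23}$$

Note that if $m = \frac{\log(V)}{\beta^*}$, by (22) we have $\mathcal{P}(D, m) \le \frac{c^*}{V}$.
Also if a steady state distribution of $\|\mathbf{U}(t) - \mathbf{U}_V^*\|$ exists under
QLA, i.e., the limit of $\frac{1}{t}\sum_{\tau=0}^{t-1} Pr\{\|\mathbf{U}(\tau) - \mathbf{U}_V^*\| > D + m\}$
exists as $t \to \infty$, then one can replace $\mathcal{P}(D, m)$ with the
steady state probability that $\mathbf{U}(t)$ deviates from $\mathbf{U}_V^*$ by an
amount of $D + m$, i.e., $Pr\{\|\mathbf{U}(t) - \mathbf{U}_V^*\| > D + m\}$.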
Therefore Theorem 1 can be viewed as showing that when
(18) is satisfied, for a large $V$, the backlog $\mathbf{U}(t)$ under QLA
will mostly be within $O(\log(V))$ distance from $\mathbf{U}_V^*$. This
implies that the average backlog will roughly be $\sum_j U_{Vj}^*$,
which is typically $\Theta(V)$ by Lemma 1. However, this fact will
also allow us to build FQLA upon QLA to "subtract out"
roughly $\sum_j U_{Vj}^*$ data from the network and reduce network
delay. Theorem 1 also highlights a deep connection between
the steady state behavior of the network backlog process $\mathbf{U}(t)$
and the structure of the dual function $q_0(\mathbf{U})$. We note that
(18) is not very restrictive. In fact, if $q_0(\mathbf{U})$ is polyhedral
(e.g., $\mathcal{X}^{(s_i)}$ is finite for all $s_i$), with a unique optimal solution
$\mathbf{U}_0^* \succeq \mathbf{0}$, then (18) can be satisfied (see Section VIII for an
example). To prove the theorem, we need the following lemma.
Lemma 2: For any $\nu > 0$, under QLA, we have for all $t$,

$$\mathbb{E}\{\|\mathbf{U}(t + T_\nu) - \mathbf{U}_V^*\|^2 \mid \mathbf{U}(t)\} \le \|\mathbf{U}(t) - \mathbf{U}_V^*\|^2 + (T_\nu^2 + T_\nu)B^2 - 2T_\nu\big(q(\mathbf{U}_V^*) - q(\mathbf{U}(t))\big) + 2T_\nu \nu B\|\mathbf{U}_V^* - \mathbf{U}(t)\|. \tag{24}$$

Proof: See Appendix A. ■

We now use Lemma 2 to prove Theorem 1.

¹ It can be seen from (17) that $B \ge L$. Thus $T_\nu B \ge \eta$.

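Proof: (Theorem 1) Part (a): We first show that if (18)
holds for $q_0(\mathbf{U})$ with $L$, then it also holds for $q(\mathbf{U})$ with the
same $L$. To this end, suppose (18) holds for $q_0(\mathbf{U})$ for all $\mathbf{U}$
satisfying $\|\mathbf{U} - \mathbf{U}_0^*\| < \epsilon$. Then for any $\mathbf{U} \succeq \mathbf{0}$ such that
$\|\mathbf{U} - \mathbf{U}_V^*\| < \epsilon V$, we have $\|\mathbf{U}/V - \mathbf{U}_0^*\| < \epsilon$, hence:

$$q_0(\mathbf{U}_0^*) \ge q_0(\mathbf{U}/V) + L\|\mathbf{U}_0^* - \mathbf{U}/V\|.$$

Multiplying both sides by $V$, we get:

$$Vq_0(\mathbf{U}_0^*) \ge Vq_0(\mathbf{U}/V) + LV\|\mathbf{U}_0^* - \mathbf{U}/V\|.$$

Now using $\mathbf{U}_V^* = V\mathbf{U}_0^*$ and $q(\mathbf{U}) = Vq_0(\mathbf{U}/V)$, we have
for all $\|\mathbf{U} - \mathbf{U}_V^*\| < \epsilon V$:

$$q(\mathbf{U}_V^*) \ge q(\mathbf{U}) + L\|\mathbf{U}_V^* - \mathbf{U}\|. \tag{25}$$

Since $q(\mathbf{U})$ is concave, we see that (25) indeed holds for all
$\mathbf{U} \succeq \mathbf{0}$. Now for a given $\eta > 0$, if:

$$(T_\nu^2 + T_\nu)B^2 - 2T_\nu\big(q(\mathbf{U}_V^*) - q(\mathbf{U}(t))\big) + 2T_\nu \nu B\|\mathbf{U}_V^* - \mathbf{U}(t)\| \le \eta^2 - 2\eta\|\mathbf{U}_V^* - \mathbf{U}(t)\|, \tag{26}$$

then by (24), we have:

$$\mathbb{E}\{\|\mathbf{U}(t + T_\nu) - \mathbf{U}_V^*\|^2 \mid \mathbf{U}(t)\} \le \big(\|\mathbf{U}(t) - \mathbf{U}_V^*\| - \eta\big)^2,$$

which then by Jensen's inequality implies:

$$\big(\mathbb{E}\{\|\mathbf{U}(t + T_\nu) - \mathbf{U}_V^*\| \mid \mathbf{U}(t)\}\big)^2 \le \big(\|\mathbf{U}(t) - \mathbf{U}_V^*\| - \eta\big)^2.$$

Thus (20) follows whenever (26) holds and $\|\mathbf{U}(t) - \mathbf{U}_V^*\| \ge \eta$.
It suffices to choose $D$ and $\eta$ such that $D \ge \eta$ and that (26)
holds whenever $\|\mathbf{U}(t) - \mathbf{U}_V^*\| \ge D$. Now note that (26) can
be rewritten as the following inequality:

$$q(\mathbf{U}_V^*) \ge q(\mathbf{U}(t)) + \Big(B\nu + \frac{\eta}{T_\nu}\Big)\|\mathbf{U}_V^* - \mathbf{U}(t)\| + Y, \tag{27}$$

where $Y = \frac{(T_\nu^2 + T_\nu)B^2 - \eta^2}{2T_\nu}$. Choose any $\nu > 0$ independent of
$V$ such that $B\nu < L$ and choose $\eta \in (0, T_\nu(L - B\nu))$. By
(25), if:

$$L\|\mathbf{U}(t) - \mathbf{U}_V^*\| \ge \Big(B\nu + \frac{\eta}{T_\nu}\Big)\|\mathbf{U}_V^* - \mathbf{U}(t)\| + Y, \tag{28}$$

then (27) holds. Now choose $D$ as defined in (21). We see that
if $\|\mathbf{U}(t) - \mathbf{U}_V^*\| \ge D$, then (28) holds, which implies (27),
and equivalently (26). We also have $D \ge \eta$, hence (20) holds.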

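Part (b): Now we show that (20) implies (22). Choose
constants $\nu$, $D$ and $\eta$ that are independent of $V$ in (a). Denote
$Y(t) = \|\mathbf{U}(t) - \mathbf{U}_V^*\|$. We see then that whenever $Y(t) \ge D$,
we have $\mathbb{E}\{Y(t + T_\nu) - Y(t) \mid \mathbf{U}(t)\} \le -\eta$. It is also easy
to see that $|Y(t + T_\nu) - Y(t)| \le T_\nu B$, as $B$ is defined in
(3) as the upper bound of the magnitude change of $\mathbf{U}(t)$ in a
slot. Define $\tilde{Y}(t) = \max[Y(t) - D, 0]$. We see that whenever
$\tilde{Y}(t) \ge T_\nu B$, we have:

$$\mathbb{E}\{\tilde{Y}(t + T_\nu) - \tilde{Y}(t) \mid \mathbf{U}(t)\} = \mathbb{E}\{Y(t + T_\nu) - Y(t) \mid \mathbf{U}(t)\} \le -\eta. \tag{29}$$

Now define a Lyapunov function of $\tilde{Y}(t)$ to be $L(\tilde{Y}(t)) = e^{w\tilde{Y}(t)}$
with some $w > 0$, and define the $T_\nu$-slot conditional
drift to be:

$$\Delta_{T_\nu}(\tilde{Y}(t)) \triangleq \mathbb{E}\{L(\tilde{Y}(t + T_\nu)) - L(\tilde{Y}(t)) \mid \mathbf{U}(t)\} = \mathbb{E}\{e^{w\tilde{Y}(t + T_\nu)} - e^{w\tilde{Y}(t)} \mid \mathbf{U}(t)\}. \tag{30}$$

It is shown in Appendix B that by choosing $w = \frac{\eta}{T_\nu^2 B^2 + T_\nu B\eta/3}$,
we have for all $\tilde{Y}(t) \ge 0$:

$$\Delta_{T_\nu}(\tilde{Y}(t)) \le e^{2wT_\nu B} - \frac{w\eta}{2} e^{w\tilde{Y}(t)}. \tag{31}$$

Taking expectation on both sides, we have:

$$\mathbb{E}\{e^{w\tilde{Y}(t + T_\nu)} - e^{w\tilde{Y}(t)}\} \le e^{2wT_\nu B} - \frac{w\eta}{2}\mathbb{E}\{e^{w\tilde{Y}(t)}\}. \tag{32}$$

Now summing (32) over $t \in \{t_0, t_0 + T_\nu, \ldots, t_0 + (N-1)T_\nu\}$
for some $t_0 \in \{0, 1, \ldots, T_\nu - 1\}$, we have:

$$\mathbb{E}\{e^{w\tilde{Y}(t_0 + NT_\nu)} - e^{w\tilde{Y}(t_0)}\} \le Ne^{2wT_\nu B} - \sum_{j=0}^{N-1} \frac{w\eta}{2}\mathbb{E}\{e^{w\tilde{Y}(t_0 + jT_\nu)}\}.$$

Rearranging the terms, we have:

$$\sum_{j=0}^{N-1} \frac{w\eta}{2}\mathbb{E}\{e^{w\tilde{Y}(t_0 + jT_\nu)}\} \le Ne^{2wT_\nu B} + \mathbb{E}\{e^{w\tilde{Y}(t_0)}\}.$$

Summing the above over $t_0 \in \{0, 1, \ldots, T_\nu - 1\}$, we obtain:

$$\sum_{t=0}^{NT_\nu - 1} \frac{w\eta}{2}\mathbb{E}\{e^{w\tilde{Y}(t)}\} \le NT_\nu e^{2wT_\nu B} + \sum_{t_0=0}^{T_\nu - 1} \mathbb{E}\{e^{w\tilde{Y}(t_0)}\}.$$

Dividing both sides by $NT_\nu$, we obtain:

$$\frac{1}{NT_\nu}\sum_{t=0}^{NT_\nu - 1} \frac{w\eta}{2}\mathbb{E}\{e^{w\tilde{Y}(t)}\} \le e^{2wT_\nu B} + \frac{1}{NT_\nu}\sum_{t_0=0}^{T_\nu - 1} \mathbb{E}\{e^{w\tilde{Y}(t_0)}\}. \tag{33}$$

Taking the limsup as $N$ goes to infinity, we obtain:

$$\limsup_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} \frac{w\eta}{2}\mathbb{E}\{e^{w\tilde{Y}(\tau)}\} \le e^{2wT_\nu B}. \tag{34}$$

Using the fact that $\mathbb{E}\{e^{w\tilde{Y}(\tau)}\} \ge e^{wm}Pr\{\tilde{Y}(\tau) > m\}$, we get:

$$\limsup_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} \frac{w\eta}{2} e^{wm} Pr\{\tilde{Y}(\tau) > m\} \le e^{2wT_\nu B}. \tag{35}$$

Plugging in $w = \frac{\eta}{T_\nu^2 B^2 + T_\nu B\eta/3}$ and using the definition of $\tilde{Y}(t)$:

$$\mathcal{P}(D, m) \le \frac{2e^{2wT_\nu B}}{w\eta} e^{-wm} = \frac{2(T_\nu^2 B^2 + T_\nu B\eta/3)\, e^{\frac{2\eta}{T_\nu B + \eta/3}}}{\eta^2}\, e^{-\frac{\eta m}{T_\nu^2 B^2 + T_\nu B\eta/3}}, \tag{36}$$

where $\mathcal{P}(D, m)$ is defined in (23). Therefore (22) holds with:

$$c^* = \frac{2(T_\nu^2 B^2 + T_\nu B\eta/3)\, e^{\frac{2\eta}{T_\nu B + \eta/3}}}{\eta^2}, \quad \beta^* = \frac{\eta}{T_\nu^2 B^2 + T_\nu B\eta/3}. \tag{37}$$

It is easy to see that $c^*$ and $\beta^*$ are both independent of $V$. ■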
Note from (33) and (34) that Theorem 1 indeed holds for
any finite $\mathbf{U}(0)$. We will later use this fact to prove the
performance of FQLA. The following theorem is a special case
of Theorem 1 and gives a more direct illustration of Theorem
1. Recall that $\mathcal{P}(D, m)$ is defined in (23). Define:

$$\mathcal{P}^{(r)}(D, m) \triangleq \limsup_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} Pr\{\exists\, j, \; |U_j(\tau) - U_{Vj}^*| > D + m\}. \tag{38}$$

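Theorem 2: If the condition in Theorem 1 holds and $S(t)$
is i.i.d., then under QLA, for any $c > 0$:

$$\mathcal{P}(D_1, cK_1\log(V)) \le \frac{c_1^*}{V^c}, \tag{39}$$
$$\mathcal{P}^{(r)}(D_1, cK_1\log(V)) \le \frac{c_1^*}{V^c}, \tag{40}$$

where $D_1 = \frac{2B^2}{L} - \frac{L}{4}$, $K_1 = \frac{B^2 + BL/6}{L/2}$ and
$c_1^* = \frac{8(B^2 + BL/6)\, e^{\frac{L}{B + L/6}}}{L^2}$.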

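Proof: First we note that when $S(t)$ is i.i.d., we have
$T_\nu = 1$ for $\nu = 0$. Now choose $\nu = 0$, $T_\nu = 1$ and $\eta = L/2$,
then we see from (21) that

$$D = \max\left[ \frac{2B^2 - L^2/4}{L}, \; \frac{L}{2} \right] \le \frac{2B^2}{L} - \frac{L}{4}.$$

Now by (37) we see that (22) holds with $c^* = c_1^*$ and $\beta^* = \frac{L/2}{B^2 + BL/6}$.
Thus by taking $D_1 = \frac{2B^2}{L} - \frac{L}{4}$, we have:

$$\mathcal{P}(D_1, cK_1\log(V)) \le c_1^* e^{-\beta^* cK_1\log(V)} = \frac{c_1^*}{V^c},$$

where the last step follows since $\beta^* K_1 = 1$. Thus (39) follows.
Equation (40) follows from (39) by using the fact that for any
constant $\zeta$, the events $\mathcal{E}_1 = \{\exists\, j, \; |U_j(\tau) - U_{Vj}^*| > \zeta\}$ and $\mathcal{E}_2 = \{\|\mathbf{U}(\tau) - \mathbf{U}_V^*\| > \zeta\}$ satisfy $\mathcal{E}_1 \subset \mathcal{E}_2$. Thus: $Pr\{\exists\, j, \; |U_j(\tau) - U_{Vj}^*| > \zeta\} \le Pr\{\|\mathbf{U}(\tau) - \mathbf{U}_V^*\| > \zeta\}$. ■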
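The constants in Theorems 1 and 2 are explicit functions of $B$, $L$, $\nu$, $\eta$ and $T_\nu$, so they are easy to tabulate. The sketch below is a Python rendering of (21) and (37) as we read them from the text, with Theorem 2 obtained as the special case $\nu = 0$, $T_\nu = 1$, $\eta = L/2$. The function names and the sample values of $B$, $L$, $V$ and $c$ are assumptions for illustration, and the check `nu >= 0` is relaxed from Theorem 1's $\nu > 0$ only to cover that i.i.d. special case.

```python
from math import exp, log

def theorem1_constants(B, L, nu, eta, T):
    """Return (D, c_star, beta_star) following (21) and (37)."""
    if not (0 <= nu < L / B and 0 < eta < T * (L - B * nu)):
        raise ValueError("need 0 <= nu < L/B and 0 < eta < T*(L - B*nu)")
    D = max(((T * T + T) * B * B - eta * eta)
            / (2 * T * (L - eta / T - B * nu)), eta)
    w = eta / (T * T * B * B + T * B * eta / 3)      # w equals beta*
    c_star = 2 * exp(2 * w * T * B) / (w * eta)
    return D, c_star, w

def theorem2_bound(B, L, V, c):
    """Return (D_1, c*K_1*log V, c_1*/V^c) for the i.i.d. case."""
    D1, c1, beta = theorem1_constants(B, L, nu=0.0, eta=L / 2, T=1)
    K1 = 1 / beta
    return D1, c * K1 * log(V), c1 / V ** c
```

For example, `theorem2_bound(B=2, L=1, V=1000, c=2)` returns the radius `D_1 = 7.75`, the extra margin `c * K_1 * log(V)`, and the resulting bound on the fraction of time the backlog lies outside that region.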

Theorem 2 can be viewed as showing that for a large $V$,
the probability for $U_j(t)$ to deviate from the $j$-th component of
$\mathbf{U}_V^*$ is exponentially decreasing in the distance. Thus it rarely
deviates from $U_{Vj}^*$ by more than $O(\log(V))$ distance. Note
that one can similarly prove the following theorem for OSM:

Theorem 3: If the condition in Theorem 1 holds, then
there exist positive constants $D = \Theta(1)$ and $\eta = \Theta(1)$, i.e.,
independent of $V$, such that, under OSM, if $\|\mathbf{U}(t) - \mathbf{U}_V^*\| \ge D$,

$$\|\mathbf{U}(t+1) - \mathbf{U}_V^*\| \le \|\mathbf{U}(t) - \mathbf{U}_V^*\| - \eta. \tag{41}$$

Proof: It is easy to show that under OSM, Lemma 2 holds with $\nu = 0$, $T_\nu = 1$ and without the expectation. Indeed, by the OSM update rule and Lemma 8 in Appendix A, we have:

$$\|U(t+1) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2B^2 - 2(U_V^* - U(t))^T G_U.$$

Now by the subgradient property of $G_U$ we have: $(U_V^* - U(t))^T G_U \ge q(U_V^*) - q(U(t))$. Plugging this into the above inequality, we obtain:

$$\|U(t+1) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(q(U_V^*) - q(U(t))\big).$$

The theorem then follows by using the same argument as in the proof of Theorem 1. ■

Therefore, when there is a single network state, we see that given (15), the backlog process converges to a ball of size $\Theta(1)$ around $U_V^*$.
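
As a rough numerical illustration of the bound in Theorem 2, the sketch below evaluates $D_1$, $K_1$ and $c_1^*$ and checks that the exponential bound $c_1^* e^{-m/K_1}$ at $m = cK_1\log(V)$ equals $c_1^*/V^c$. The values of `B`, `L`, `V` and `c` are arbitrary placeholders, and the expression used for $c_1^*$ is an assumed reading of the theorem statement rather than a verified formula.

```python
import math

def theorem2_constants(B, L):
    """Return (D1, K1, c1) for the i.i.d. case with eta = L/2 and T_nu = 1."""
    D1 = 2.0 * B**2 / L + L / 4.0
    K1 = (B**2 + B * L / 6.0) / (L / 2.0)
    # Assumed form of c1*: 8 (B^2 + BL/6) exp((L/2) / (B + L/6)) / L^2
    c1 = 8.0 * (B**2 + B * L / 6.0) * math.exp((L / 2.0) / (B + L / 6.0)) / L**2
    return D1, K1, c1

def tail_bound(B, L, V, c):
    """Exponential bound c1 * exp(-m / K1) evaluated at m = c * K1 * log(V)."""
    _, K1, c1 = theorem2_constants(B, L)
    m = c * K1 * math.log(V)
    return c1 * math.exp(-m / K1)

if __name__ == "__main__":
    B, L, V, c = 2.0, 1.0, 1000.0, 4.0  # placeholder values, not from the paper
    D1, K1, c1 = theorem2_constants(B, L)
    print("D1 =", D1, "K1 =", K1, "c1 =", c1)
    print("bound at m = c*K1*log(V):", tail_bound(B, L, V, c))
    print("c1 / V^c:                ", c1 / V**c)
```

The two printed bounds coincide because $\beta^* K_1 = 1$, which is exactly the last step in the proof of (39).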

B. When $q_0(\cdot)$ is "locally smooth"

In this section, we consider the backlog behavior under QLA, for the case where the dual function $q_0(U)$ is "locally smooth" at $U_0^*$. Specifically, we say that the function $q_0(U)$ is locally smooth at $U_0^*$ with parameters $\epsilon, L > 0$ if for all $U \succeq 0$ such that $\|U - U_0^*\| < \epsilon$, we have:

$$q_0(U_0^*) \ge q_0(U) + L\|U - U_0^*\|^2. \tag{42}$$

This condition contains the case when $q_0(U)$ is twice differentiable with $\nabla q(U_0^*) = 0$ and $x^T \nabla^2 q(U) x \le -2L\|x\|^2$ for any $U$ with $\|U_0^* - U\| < \epsilon$. Such a case usually occurs when the sets $\mathcal{X}^{(s_i)}$, $i = 1, \ldots, M$, are convex, so that a "continuous" set of actions is available. Notice that (42) is a looser condition than (15) in the neighborhood of $U_0^*$. As we will see, this structural difference of $q_0(U)$ in the neighborhood of $U_0^*$ greatly affects the behavior of backlogs under QLA.

Theorem 4: If $q_0(U)$ is locally smooth at $U_0^*$ with parameters $\epsilon, L > 0$, independent of $V$, then under QLA with a sufficiently large $V$, we have:

(a) There exists $D = \Theta(\sqrt{V})$ such that whenever $\|U(t) - U_V^*\| \ge D$, we have:

$$E\{\|U(t+1) - U_V^*\| \mid U(t)\} \le \|U(t) - U_V^*\| - \frac{1}{\sqrt{V}}. \tag{43}$$

(b) $P(D, m) \le c^* e^{-\beta^* m}$, where $P(D, m)$ is defined in (23), $c^* = \Theta(V)$ and $\beta^* = \Theta(1/\sqrt{V})$.

Theorem 4 can be viewed as showing that, when $q_0(U)$ is locally smooth at $U_0^*$, the backlog vector will mostly be within $O(\sqrt{V}\log(V))$ distance from $U_V^*$. This contrasts with Theorem 1, which shows that the backlog will mostly be within $O(\log(V))$ distance from $U_V^*$. Intuitively, this is due to the fact that under local smoothness, the drift towards $U_V^*$ is smaller as $U$ gets closer to $U_V^*$, hence a $\Theta(\sqrt{V})$ distance is needed to guarantee a drift of size $\Theta(1/\sqrt{V})$; whereas under (15), any nonzero $\Theta(1)$ deviation from $U_V^*$ roughly generates a drift of size $\Theta(1)$ towards $U_V^*$, ensuring the backlog stays within $O(\log(V))$ distance from $U_V^*$. To prove Theorem 4, we need the following corollary of Lemma 2.

Corollary 1: If $S(t)$ is i.i.d., then under QLA,

$$E\{\|U(t+1) - U_V^*\|^2 \mid U(t)\} \le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(q(U_V^*) - q(U(t))\big).$$

Proof: When $S(t)$ is i.i.d., we have $T_\nu = 1$ for $\nu = 0$. ■

Proof: (Theorem 4) Part (a): We first see that for any $U$ with $\|U - U_V^*\| < \epsilon V$, we have $\|U/V - U_0^*\| < \epsilon$. Therefore,

$$q_0(U_0^*) \ge q_0(U/V) + L\|U/V - U_0^*\|^2. \tag{44}$$

Multiplying both sides by $V$, we get

$$q(U_V^*) \ge q(U) + \frac{L}{V}\|U - U_V^*\|^2. \tag{45}$$
Similarly to the proof of Theorem 1, and by Corollary 1, we see that for (43) to hold, we need $\|U(t) - U_V^*\| \ge \frac{1}{\sqrt{V}}$ and:

$$2B^2 - 2\big(q(U_V^*) - q(U(t))\big) \le \frac{1}{V} - \frac{2}{\sqrt{V}}\|U(t) - U_V^*\|,$$

which can be rewritten as:

$$q(U_V^*) \ge q(U(t)) + B^2 - \frac{1}{2V} + \frac{1}{\sqrt{V}}\|U(t) - U_V^*\|. \tag{46}$$

By (45), we see that for (46) to hold, we only need:

$$\frac{L}{V}\|U - U_V^*\|^2 \ge B^2 + \frac{1}{\sqrt{V}}\|U - U_V^*\|. \tag{47}$$

It is easy to see that (47) holds whenever:

$$\|U - U_V^*\| \ge \frac{\frac{1}{\sqrt{V}} + \sqrt{\frac{1}{V} + \frac{4LB^2}{V}}}{2L/V} = \frac{\sqrt{V} + \sqrt{V + 4LVB^2}}{2L}.$$

Denote $D = \frac{\sqrt{V} + \sqrt{V + 4LVB^2}}{2L}$. We see now that when $V$ is large, (43) holds for any $U$ with $D \le \|U - U_V^*\| < \epsilon V$. Now since $q(U)$ is concave, it is easy to show that (46) holds for all $\|U - U_V^*\| \ge D$. Hence (43) holds for all $\|U - U_V^*\| \ge D$, proving Part (a).
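
To make the $\Theta(\sqrt{V})$ scaling of $D$ concrete, the following sketch computes the larger root of the quadratic $(L/V)x^2 - x/\sqrt{V} - B^2 = 0$ and checks that it grows exactly like $\sqrt{V}$. It assumes the sufficient condition takes the form $(L/V)x^2 \ge B^2 + x/\sqrt{V}$; the values of `B` and `L` are placeholders.

```python
import math

def drift_threshold(B, L, V):
    """Larger root of (L/V) x^2 - x/sqrt(V) - B^2 = 0 (assumed form of D)."""
    return (math.sqrt(V) + math.sqrt(V + 4.0 * L * V * B**2)) / (2.0 * L)

def slack(x, B, L, V):
    """(L/V) x^2 - B^2 - x/sqrt(V); nonnegative means the condition holds at x."""
    return (L / V) * x**2 - B**2 - x / math.sqrt(V)

if __name__ == "__main__":
    B, L = 2.0, 1.0  # placeholder constants
    for V in (100.0, 10000.0, 1000000.0):
        D = drift_threshold(B, L, V)
        # D / sqrt(V) is the same constant for every V
        print(V, D, D / math.sqrt(V), slack(D, B, L, V))
```

Since $\sqrt{V + 4LVB^2} = \sqrt{V}\sqrt{1 + 4LB^2}$, the ratio $D/\sqrt{V}$ equals $(1 + \sqrt{1 + 4LB^2})/(2L)$ for every $V$.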

Part (b): By an argument similar to that in the proof of Theorem 1, we see that Part (b) follows with: $\beta^* = \frac{1}{\sqrt{V}B^2 + B/3}$ and $c^* = 2(VB^2 + B\sqrt{V}/3)\, e^{\frac{1}{B\sqrt{V} + 1/3}}$. ■

Notice that in this case we can also prove a result similar to Theorem 3 for OSM, with the only difference that $D = \Theta(\sqrt{V})$.

C. Implications of Theorem 1 and 4

Consider the following simple problem: an operator operates a single queue and tries to support a Bernoulli arrival, i.e., either 1 or 0 packet arrives every slot, with rate $\lambda = 0.5$ (the rate may be unknown to the operator) with minimum energy expenditure. The channel is time-invariant. The rate-power curve over the channel is given by: $\mu(t) = \log(1 + PW(t))$, where $PW(t)$ is the allocated power at time $t$. Thus to obtain a rate of $\mu(t)$, we need $PW(t) = e^{\mu(t)} - 1$. Every time slot, the operator decides how much power to allocate and serves the queue at the corresponding rate, with the goal of minimizing the time average power consumption subject to queue stability. Let $\Phi$ denote the time average energy expenditure incurred by the optimal policy. It is not difficult to see that $\Phi = e^{0.5} - 1$.

Now we look at the deterministic problem:

$$\min:\ V(e^{\mu} - 1), \qquad \text{s.t.}:\ 0.5 \le \mu.$$

It is easy to obtain $q(U) = \inf_{\mu}\{V(e^{\mu} - 1) + U(0.5 - \mu)\}$. Hence by the KKT conditions [12] one obtains that $U_V^* = Ve^{0.5}$ and the optimal policy is to serve the queue at the constant rate $\mu^* = 0.5$. Suppose now QLA is applied to the problem. Then, at every slot $t$, given $U(t) = U$, QLA chooses the power to achieve the rate $\mu(t)$ such that:

$$\mu(t) \in \arg\min_{\mu}\{V(e^{\mu} - 1) + U(0.5 - \mu)\} = \log\left(\frac{U}{V}\right), \tag{48}$$



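which incurs an instantaneous power consumption of $PW(t) = U(t)/V - 1$. Now by Theorem 4, for most of the time $U(t) \in [U_V^* - \sqrt{V}, U_V^* + \sqrt{V}]$, i.e., $U(t) \in [Ve^{0.5} - \sqrt{V}, Ve^{0.5} + \sqrt{V}]$. Hence it is almost always the case that:

$$\log\left(e^{0.5} - \frac{1}{\sqrt{V}}\right) \le \mu(t) \le \log\left(e^{0.5} + \frac{1}{\sqrt{V}}\right),$$

which implies: $0.5 - \frac{1}{\sqrt{V}} \le \mu(t) \le 0.5 + \frac{1}{\sqrt{V}}$. Thus by a similar argument as in [8], one can show that $\overline{PW} \le \Phi + O(1/V)$, where $\overline{PW}$ is the average power consumption.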
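
The behavior described above can be checked with a small simulation. The sketch below runs the rule $\mu(t) = \log(U(t)/V)$ on a single queue with Bernoulli(0.5) arrivals and reports the time-average backlog and power over the second half of the run. The function name, the run length, the seed, and the clipping of the rate at zero when $U(t) < V$ are illustrative assumptions, not part of the analysis.

```python
import math
import random

def simulate_qla(V, slots=20000, seed=0):
    """Single-queue QLA with rate-power curve mu = log(1 + PW)."""
    rng = random.Random(seed)
    U = 0.0
    backlog_sum = 0.0
    power_sum = 0.0
    counted = 0
    for t in range(slots):
        # QLA rate: minimizer of V(e^mu - 1) + U(0.5 - mu); clipped at 0 (assumption)
        mu = math.log(U / V) if U > V else 0.0
        power = math.exp(mu) - 1.0
        arrival = 1.0 if rng.random() < 0.5 else 0.0
        if t >= slots // 2:  # discard the first half as warm-up
            backlog_sum += U
            power_sum += power
            counted += 1
        U = max(U - mu, 0.0) + arrival
    return backlog_sum / counted, power_sum / counted

if __name__ == "__main__":
    V = 50.0
    avg_U, avg_PW = simulate_qla(V)
    print("average backlog:", avg_U, " V*e^0.5 =", V * math.exp(0.5))
    print("average power:  ", avg_PW, " e^0.5-1 =", math.exp(0.5) - 1.0)
```

In such a run the average backlog should sit near $Ve^{0.5}$ and the average power near $e^{0.5} - 1$, in line with the discussion above.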

Now consider the case when we can only choose to operate at $\mu \in \{0, \frac{1}{4}, \frac{3}{4}, 1\}$, with the corresponding power consumptions being: $PW \in \{0, e^{1/4} - 1, e^{3/4} - 1, e - 1\}$. One can similarly obtain $\Phi = \frac{1}{2}(e^{1/4} + e^{3/4}) - 1$ and $U_V^* = 2V(e^{3/4} - e^{1/4})$. In this case, $\Phi$ is achieved by time sharing the two rates $\{\frac{1}{4}, \frac{3}{4}\}$ with equal portions of time. Now by Theorem 1, we see that under QLA, $U(t)$ is mostly within $O(\log(V))$ distance of $U_V^*$. Hence by (48), we see that QLA almost always chooses between the two rates $\{\frac{1}{4}, \frac{3}{4}\}$, and uses them with almost equal frequencies. Hence QLA is also able to achieve $\overline{PW} = \Phi + O(1/V)$ in this case.
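
The switching behavior in the discrete case can be illustrated directly. The sketch below evaluates the QLA objective $V(e^{\mu} - 1) + U(0.5 - \mu)$ over the four allowed rates and shows that the chosen rate flips from $\frac{1}{4}$ to $\frac{3}{4}$ as the backlog crosses $2V(e^{3/4} - e^{1/4})$. The function names and the value of `V` are illustrative assumptions.

```python
import math

RATES = (0.0, 0.25, 0.75, 1.0)  # allowed service rates in the discrete example

def qla_discrete_rate(U, V, rates=RATES):
    """Rate minimizing V(e^mu - 1) + U(0.5 - mu) over the discrete set."""
    return min(rates, key=lambda mu: V * (math.exp(mu) - 1.0) + U * (0.5 - mu))

def switching_backlog(V):
    """Backlog at which rates 1/4 and 3/4 give the same objective value."""
    return 2.0 * V * (math.exp(0.75) - math.exp(0.25))

if __name__ == "__main__":
    V = 100.0
    u_star = switching_backlog(V)
    print("switching backlog:", u_star)
    print("rate just below:", qla_discrete_rate(u_star - 1.0, V))
    print("rate just above:", qla_discrete_rate(u_star + 1.0, V))
```

Because the backlog hovers around this switching point, the two rates are used with nearly equal frequency, which is the time-sharing solution described above.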

The above argument can be generalized to many stochastic network optimization problems. Thus, we see that Theorems 1 and 4 not only provide us with probabilistic deviation bounds of $U(t)$ from $U_V^*$, but also help to explain why QLA is able to achieve the desired utility performance: under QLA, $U(t)$ always stays close to $U_V^*$, hence the chosen action is always close to the set of optimal actions.



VI. The FQLA Algorithm 

In this section, we propose a family of Fast Quadratic Lyapunov based Algorithms (FQLA) for general stochastic network optimization problems. We first provide an example to illustrate the idea of FQLA. We then describe FQLA with known $U_V^*$, called FQLA-Ideal, and study its performance. After that, we describe the more general FQLA without such knowledge, called FQLA-General. For brevity, we only describe FQLA for the case when $q_0(U)$ is locally polyhedral. FQLA for the other case is briefly discussed in Section VI-E.

A. FQLA: a Single Queue Example 

To illustrate the idea of FQLA, we first look at an example. Figure 3 shows a $10^4$-slot sample backlog process under QLA (see the footnote). We see that after roughly 1500 slots, $U(t)$ always stays very close to $U_V^*$, which is a $\Theta(V)$ scalar in this case. To reduce delay, we can first find $\mathcal{W} \in (0, U_V^*)$ such that: under QLA, there exists a time $t_0$ so that $U(t_0) \ge \mathcal{W}$ and once $U(t) \ge \mathcal{W}$, it remains so for all time (the solid line in Fig. 3 shows one for these $10^4$ slots). We then place $\mathcal{W}$ fake bits (called place-holder bits [11]) in the queue at time 0, i.e., initialize $U(0) = \mathcal{W}$, and run QLA. It is easy to show that the utility performance of QLA will remain the same with this change, and the average backlog is now reduced by $\mathcal{W}$. However, such a $\mathcal{W}$ may require $\mathcal{W} = U_V^* - \Theta(V)$, thus the average backlog may still be $\Theta(V)$.

Footnote: This sample backlog process is one sample backlog process of queue 1 of the system considered in Section VIII under QLA with $V = 50$.
Fig. 3. Left: A sample backlog process; Right: An example of $W(t)$ and $U(t)$.

FQLA instead finds a $\mathcal{W}$ such that in steady state, the backlog process under QLA rarely goes below it, and places $\mathcal{W}$ place-holder bits in the queue at time 0. FQLA then uses an auxiliary process $W(t)$, called the virtual backlog process, to keep track of the backlog process that would have been generated had QLA been used. Specifically, FQLA initializes $W(0) = \mathcal{W}$. Then at every slot, QLA is run using $W(t)$ as the queue size, and $W(t)$ is updated according to QLA. With $W(t)$ and $\mathcal{W}$, FQLA works as follows: At time $t$, if $W(t) \ge \mathcal{W}$, FQLA performs QLA's action (obtained based on $S(t)$ and $W(t)$); else if $W(t) < \mathcal{W}$, FQLA carefully modifies QLA's action so as to maintain $U(t) \approx \max[W(t) - \mathcal{W}, 0]$ for all $t$ (see Fig. 3 for an example). As above, this roughly reduces the average backlog by $\mathcal{W}$. The difference is that now we can show that $\mathcal{W} = \max[U_V^* - \log^2(V), 0]$ meets the requirement. Thus it is possible to bring the average backlog down to $O(\log^2(V))$. Also, since $W(t)$ can be viewed as a backlog process generated by QLA, it rarely goes below $\mathcal{W}$ in steady state. Hence FQLA is almost always the same as QLA, and thus is able to achieve an $O(1/V)$ close-to-optimal utility performance.

B. The FQLA-Ideal Algorithm 

In this section, we present the FQLA-Ideal algorithm. We assume the value $U_V^* = (U^*_{V1}, \ldots, U^*_{Vr})^T$ is known a priori.

FQLA-Ideal:

(I) Determining place-holder bits: For each $j$, define:

$$\mathcal{W}_j = \max\left[U^*_{Vj} - \log^2(V),\ 0\right], \tag{49}$$

as the number of place-holder bits of queue $j$.

(II) Place-holder-bit based action: Initialize $U_j(0) = 0$, $W_j(0) = \mathcal{W}_j$, $\forall\, j$. For $t \ge 1$, observe the network state $S(t)$ and solve QLA's per-slot optimization problem with $W(t)$ in place of $U(t)$. Perform the chosen action with the following modification: Let $A(t)$ and $\mu(t)$ be the arrival and service rate vectors generated by the action. For each queue $j$, do (idle fill whenever needed):

a) If $W_j(t) \ge \mathcal{W}_j$: admit $A_j(t)$ arrivals, serve $\mu_j(t)$ data, i.e., update the backlog by:

$$U_j(t+1) = \max\left[U_j(t) - \mu_j(t),\ 0\right] + A_j(t).$$

b) If $W_j(t) < \mathcal{W}_j$: admit $\tilde{A}_j(t) = \max\left[A_j(t) - \mathcal{W}_j + W_j(t),\ 0\right]$ arrivals, serve $\mu_j(t)$ data, i.e., update the backlog by:

$$U_j(t+1) = \max\left[U_j(t) - \mu_j(t),\ 0\right] + \tilde{A}_j(t).$$

c) Update $W_j(t)$ by:

$$W_j(t+1) = \max\left[W_j(t) - \mu_j(t),\ 0\right] + A_j(t).$$
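
The steps above can be sketched for a single queue as follows. This is only a toy illustration: it reuses the discrete-rate example of Section V-C (rates $\{0, \frac14, \frac34, 1\}$, Bernoulli(0.5) arrivals, $U_V^* = 2V(e^{3/4} - e^{1/4})$), assumes the actual backlog update in Step II-(b) uses the admitted arrivals $\tilde{A}_j(t)$, and all names (`simulate_fqla_ideal`, `placeholder`, `DELTA_MAX`) are hypothetical. Along the way it checks the sandwich relation $\max[W(t) - \mathcal{W}, 0] \le U(t) \le \max[W(t) - \mathcal{W}, 0] + \delta_{max}$.

```python
import math
import random

RATES = (0.0, 0.25, 0.75, 1.0)   # discrete service rates of the toy example
DELTA_MAX = 1.0                  # at most one arrival or departure per slot

def qla_rate(backlog, V):
    """QLA rate choice: minimize V(e^mu - 1) - backlog * mu over RATES."""
    return min(RATES, key=lambda mu: V * (math.exp(mu) - 1.0) - backlog * mu)

def simulate_fqla_ideal(V=100.0, slots=20000, seed=1):
    rng = random.Random(seed)
    u_star = 2.0 * V * (math.exp(0.75) - math.exp(0.25))  # assumed known multiplier
    placeholder = max(u_star - math.log(V) ** 2, 0.0)     # Step (I)
    U, W = 0.0, placeholder                               # Step (II) initialization
    backlog_sum = arrived = dropped = 0.0
    sandwich_ok = True
    for _ in range(slots):
        # check max[W - placeholder, 0] <= U <= max[W - placeholder, 0] + delta_max
        lower = max(W - placeholder, 0.0)
        if not (lower - 1e-9 <= U <= lower + DELTA_MAX + 1e-9):
            sandwich_ok = False
        backlog_sum += U
        mu = qla_rate(W, V)                 # QLA action computed from the virtual backlog
        A = 1.0 if rng.random() < 0.5 else 0.0
        if W >= placeholder:                # Step II-(a): admit everything
            admitted = A
        else:                               # Step II-(b): admit only the excess
            admitted = max(A - placeholder + W, 0.0)
        U = max(U - mu, 0.0) + admitted     # actual backlog
        W = max(W - mu, 0.0) + A            # Step II-(c): virtual backlog
        arrived += A
        dropped += A - admitted
    return backlog_sum / slots, dropped / max(arrived, 1.0), sandwich_ok

if __name__ == "__main__":
    avg_U, drop_frac, ok = simulate_fqla_ideal()
    print("average backlog:", avg_U, " log^2(V) =", math.log(100.0) ** 2)
    print("fraction dropped:", drop_frac, " sandwich relation held:", ok)
```

In this toy setting the actual backlog stays on the order of $\log^2(V)$ while the virtual backlog $W(t)$ tracks $U_V^*$, and packets are dropped only in the few slots where $W(t)$ falls below the place-holder level.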

From above we see that FQLA-Ideal is the same as QLA based on $W(t)$ when $W_j(t) \ge \mathcal{W}_j$ for all $j$. When $W_j(t) < \mathcal{W}_j$ for some queue $j$, FQLA-Ideal admits roughly the excessive packets after $W_j(t)$ is brought back to be above $\mathcal{W}_j$ for the queue. Thus for problems where QLA admits an easy implementation, e.g., [3], [5], it is also easy to implement FQLA. However, we also notice two different features of FQLA: (1) By (49), $\mathcal{W}_j$ can be 0. However, when $V$ is large, this happens only when $U^*_{0j} = U^*_{Vj} = 0$ according to Lemma 1. In this case $\mathcal{W}_j = U^*_{Vj} = 0$, and queue $j$ indeed needs zero place-holder bits. (2) Packets may be dropped in Step II-(b) upon their arrivals, or after they are admitted into the network in a multihop problem. Such packet dropping is natural in many flow control problems and does not change the nature of these problems. In other problems where such an option is not available, the packet dropping option is introduced to achieve the desired delay performance, and it can be shown that the fraction of packets dropped can be made arbitrarily small. Note that packet dropping here is to compensate for the deviation from the desired Lagrange multiplier, and thus is different from that in [15], where packet dropping is used for drift steering.

C. Performance of FQLA-Ideal 

We look at the performance of FQLA-Ideal in this section. We first have the following lemma that shows the relationship between $U(t)$ and $W(t)$ under FQLA-Ideal. We will use it later to prove the delay bound of FQLA. Note that the lemma also holds for FQLA-General described later, as FQLA-Ideal/General differ only in the way of determining $\mathcal{W} = (\mathcal{W}_1, \ldots, \mathcal{W}_r)^T$.

Lemma 3: Under FQLA-Ideal/General, we have $\forall\, j, t$:

$$\max\left[W_j(t) - \mathcal{W}_j,\ 0\right] \le U_j(t) \le \max\left[W_j(t) - \mathcal{W}_j,\ 0\right] + \delta_{max}, \tag{50}$$

where $\delta_{max}$ is defined in Section III-B to be the upper bound of the number of arriving or departing packets of a queue.
Proof: See Appendix C. ■

The following theorem summarizes the main performance results of FQLA-Ideal. Recall that for a given policy $\pi$, $f^{\pi}_{av}$ denotes its average cost and $f^{\pi}(t)$ denotes the cost induced by $\pi$ at time $t$.

Theorem 5: If the condition in Theorem 1 holds and a steady state distribution exists for the backlog process generated by QLA, then with a sufficiently large $V$, we have under FQLA-Ideal that,

$$\overline{U} = O(\log^2(V)), \tag{51}$$
$$f^{FI}_{av} = f^*_{av} + O(1/V), \tag{52}$$
$$P_{drop} = O(1/V^{c_0\log(V)}), \tag{53}$$

where $c_0 = \Theta(1)$; $\overline{U}$ is the time average backlog, $f^{FI}_{av}$ is the time average cost of FQLA-Ideal, $f^*_{av}$ is the optimal time average cost and $P_{drop}$ is the time average fraction of packets that are dropped in Step II-(b).

Proof: Since a steady state distribution exists for the backlog process generated by QLA, we see that $P(D, m)$ in (23) represents the steady state probability of the event that the backlog vector deviates from $U_V^*$ by distance $D + m$. Now since $W(t)$ can be viewed as a backlog process generated by QLA, with $W(0) = \mathcal{W}$ instead of 0, we see from the proof of Theorem 1 that Theorems 1 and 2 hold for $W(t)$, and by [7], QLA based on $W(t)$ achieves an average cost of $f^*_{av} + O(1/V)$. Hence by Theorem 2, there exist constants $D_1, K_1, c_1^* = \Theta(1)$ so that: $P^{(r)}(D_1, cK_1\log(V)) \le \frac{c_1^*}{V^c}$. By the definition of $P^{(r)}(D_1, cK_1\log(V))$, this implies that in steady state:

$$\Pr\{W_j(t) > U^*_{Vj} + D_1 + m\} \le c_1^* e^{-m/K_1}.$$

Now let: $Q_j(t) = \max[W_j(t) - U^*_{Vj} - D_1, 0]$. We see that $\Pr\{Q_j(t) > m\} \le c_1^* e^{-m/K_1}$, $\forall\, m \ge 0$. We thus have $\overline{Q}_j = O(1)$, where $\overline{Q}_j$ is the time average value of $Q_j(t)$. Now it is easy to see by (49) and (50) that $U_j(t) \le Q_j(t) + \log^2(V) + D_1 + \delta_{max}$ for all $t$. Thus (51) follows since for a large $V$:

$$\overline{U}_j \le \overline{Q}_j + \log^2(V) + D_1 + \delta_{max} = \Theta(\log^2(V)).$$

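Now consider the average cost. To save space, we use FI for FQLA-Ideal. From above, we see that QLA based on $W(t)$ achieves an average cost of $f^*_{av} + O(1/V)$. Thus it suffices to show that FQLA-Ideal performs almost the same as QLA based on $W(t)$. First we have for all $t \ge 1$ that:

$$\frac{1}{t}\sum_{\tau=0}^{t-1} f^{FI}(\tau) = \frac{1}{t}\sum_{\tau=0}^{t-1} f^{FI}(\tau) 1_{E(\tau)} + \frac{1}{t}\sum_{\tau=0}^{t-1} f^{FI}(\tau) 1_{E^c(\tau)}.$$

Here $1_{E(\tau)}$ is the indicator function of the event $E(\tau)$, $E(\tau)$ is the event that FQLA-Ideal performs the same action as QLA at time $\tau$, and $1_{E^c(\tau)} = 1 - 1_{E(\tau)}$. Taking expectation on both sides and using the fact that when FQLA-Ideal takes the same action as QLA, $f^{FI}(\tau) = f^{QLA}(\tau)$, we have:

$$\frac{1}{t}\sum_{\tau=0}^{t-1} E\{f^{FI}(\tau)\} \le \frac{1}{t}\sum_{\tau=0}^{t-1} E\{f^{QLA}(\tau) 1_{E(\tau)}\} + \frac{1}{t}\sum_{\tau=0}^{t-1} E\{\delta_{max} 1_{E^c(\tau)}\}.$$

Taking the limit as $t$ goes to infinity on both sides and using $f^{QLA}(\tau) 1_{E(\tau)} \le f^{QLA}(\tau)$, we get:

$$f^{FI}_{av} \le \lim_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1} E\{f^{QLA}(\tau)\} + \delta_{max}\lim_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1} \Pr\{E^c(\tau)\} = f^{QLA}_{av} + \delta_{max}\lim_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1} \Pr\{E^c(\tau)\}. \tag{54}$$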

However, $E^c(\tau)$ is included in the event that there exists a $j$ such that $W_j(\tau) < \mathcal{W}_j$. Therefore by (40) in Theorem 2, for a large $V$ such that $\frac{1}{2}\log^2(V) \ge D_1$ and $\log(V) \ge 8K_1$,

$$\lim_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1} \Pr\{E^c(\tau)\} \le P^{(r)}(D_1, \log^2(V) - D_1) = O\big(c_1^*/V^{\frac{1}{2K_1}\log(V)}\big) = O(1/V^4). \tag{55}$$

Using this fact in (54), we obtain:

$$f^{FI}_{av} \le f^{QLA}_{av} + O(\delta_{max}/V^4) = f^*_{av} + O(1/V),$$

where the last equality holds since $f^{QLA}_{av} = f^*_{av} + O(1/V)$. This proves (52). (53) follows since packets are dropped at time $\tau$ only if $E^c(\tau)$ happens, thus by (55), the fraction of time when packet dropping happens is $O(1/V^{c_0\log(V)})$ with $c_0 = \frac{1}{2K_1} = \Theta(1)$, and each time no more than $\sqrt{r}B$ packets can be dropped. ■

D. The FQLA-General algorithm 

Now we describe the FQLA algorithm without any a priori knowledge of $U_V^*$, called FQLA-General. FQLA-General first runs the system for a long enough time $T$, such that the system enters its steady state. Then it chooses a sample of the queue vector value to estimate $U_V^*$ and uses that to decide the number of place-holder bits.

FQLA-General:

(I) Determining place-holder bits:

a) Choose a large time $T$ (see Section VI-F for the size of $T$) and initialize $W(0) = 0$. Run the QLA algorithm with parameter $V$; at every time slot $t$, update $W(t)$ according to the QLA algorithm and obtain $W(T)$.

b) For each queue $j$, define:

$$\mathcal{W}_j = \max\left[W_j(T) - \log^2(V),\ 0\right], \tag{56}$$

as the number of place-holder bits.

(II) Place-holder-bit based action: same as FQLA-Ideal.
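
Step (I) can be sketched for a single queue as below. The sketch again uses the discrete-rate toy example of Section V-C as a stand-in for the network (an assumption made purely for illustration), and the names `estimate_placeholder_bits` and `qla_rate` are hypothetical. It runs QLA from an empty queue for $T$ slots and derives the place-holder count from the final sample $W(T)$.

```python
import math
import random

RATES = (0.0, 0.25, 0.75, 1.0)  # discrete service rates of the toy example

def qla_rate(backlog, V):
    """QLA rate choice: minimize V(e^mu - 1) - backlog * mu over RATES."""
    return min(RATES, key=lambda mu: V * (math.exp(mu) - 1.0) - backlog * mu)

def estimate_placeholder_bits(V, T, seed=0):
    """Run QLA for T slots from W(0) = 0; return (W(T), max[W(T) - log^2(V), 0])."""
    rng = random.Random(seed)
    W = 0.0
    for _ in range(T):
        mu = qla_rate(W, V)
        A = 1.0 if rng.random() < 0.5 else 0.0   # Bernoulli(0.5) arrivals
        W = max(W - mu, 0.0) + A
    return W, max(W - math.log(V) ** 2, 0.0)

if __name__ == "__main__":
    V = 100.0
    W_T, bits = estimate_placeholder_bits(V, T=int(50 * V))
    u_star = 2.0 * V * (math.exp(0.75) - math.exp(0.25))
    print("W(T) =", W_T, " U_V^* =", u_star, " place-holder bits =", bits)
```

When $T$ is long enough for the queue to reach steady state, $W(T)$ lands close to $U_V^*$, so the resulting place-holder count is close to the one FQLA-Ideal would use.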
The performance of FQLA-General is summarized as follows: 
Theorem 6: Assume the conditions in Theorem 5 hold and the system is in steady state at time $T$; then under FQLA-General with a sufficiently large $V$, with probability $1 - O(\frac{1}{V^4})$: (a) $\overline{U} = O(\log^2(V))$, (b) $f^{FG}_{av} = f^*_{av} + O(1/V)$, and (c) $P_{drop} = O(1/V^{c_0\log(V)})$, where $c_0 = \Theta(1)$ and $f^{FG}_{av}$ is the time average cost of FQLA-General.

Proof: We will show that with probability $1 - O(\frac{1}{V^4})$, $\mathcal{W}_j$ is close to $\max[U^*_{Vj} - \log^2(V), 0]$. The rest can then be proven similarly as in the proof of Theorem 5.

For each queue $j$, define:

$$v_j^+ = U^*_{Vj} + \frac{1}{2}\log^2(V), \qquad v_j^- = \max\left[U^*_{Vj} - \frac{1}{2}\log^2(V),\ 0\right].$$

Note that $v_j^-$ is defined with a $\max[\cdot]$ operator. This is due to the fact that $U^*_{Vj}$ can be zero. As in (55), we see that by Theorem 2, there exist $D_1 = \Theta(1)$, $K_1 = \Theta(1)$ such that if $V$ is such that $\frac{1}{4}\log^2(V) \ge D_1$ and $\log(V) \ge 16K_1$, then:

$$\Pr\{\exists\, j,\ W_j(T) \notin [v_j^-, v_j^+]\} \le P^{(r)}\left(D_1, \frac{1}{2}\log^2(V) - D_1\right) = O(1/V^4).$$

Thus we see that $\Pr\{W_j(T) \in [v_j^-, v_j^+]\ \forall\, j\} = 1 - O(1/V^4)$, which implies:

$$\Pr\{\mathcal{W}_j \in [\hat{v}_j^-, \hat{v}_j^+]\ \forall\, j\} = 1 - O(1/V^4),$$

where $\hat{v}_j^+ = \max[U^*_{Vj} - \frac{1}{2}\log^2(V), 0]$ and $\hat{v}_j^- = \max[U^*_{Vj} - \frac{3}{2}\log^2(V), 0]$. Hence for a large $V$, with probability $1 - O(\frac{1}{V^4})$: if $U^*_{Vj} > 0$, we have $U^*_{Vj} - \frac{3}{2}\log^2(V) \le \mathcal{W}_j \le U^*_{Vj} - \frac{1}{2}\log^2(V)$; else if $U^*_{Vj} = 0$, we have $\mathcal{W}_j = U^*_{Vj}$. The rest of the proof is similar to the proof of Theorem 5. ■

E. FQLA when $q_0(\cdot)$ is locally smooth

Note that FQLA can also be implemented for problems with $q_0(U)$ being locally smooth, with the only modification that $\mathcal{W}_j = \max[U^*_{Vj} - \log^2(V)\sqrt{V}, 0]$. In this case, the following theorem can be obtained:

Theorem 7: Assume the condition in Theorem 4 holds and a steady state distribution exists for the backlog process under QLA; then FQLA-Ideal achieves an $[O(1/V), O(\log^2(V)\sqrt{V})]$ performance-delay tradeoff, with $P_{drop} = O(1/V^{c_0\log(V)})$ where $c_0 = \Theta(1)$; similarly, for appropriately chosen $T$, FQLA-General achieves the same performance with probability $1 - O(1/V^4)$.

F. Practical Issues 

From Lemma 1 we see that the magnitude of $U_V^*$ can be $\Theta(V)$. This means that $T$ in FQLA-General may need to be $\Omega(V)$, which is not very desirable when $V$ is large. We can instead use the following heuristic method to accelerate the process of determining $\mathcal{W}$: For every queue $j$, guess a very large $W_j$. Then start with this $W$ and run the QLA algorithm for some $T_1$, say $\sqrt{V}$ slots. Observe the resulting backlog process. Modify the guess for each queue $j$ using a bisection algorithm until a proper $W$ is found, i.e., when running QLA from that value, we observe fluctuations of $W_j(t)$ around $W_j$ instead of a nearly constant increase or decrease for all $j$. Then let $\mathcal{W}_j = \max[W_j - \log^2(V), 0]$ be the number of place-holder bits of queue $j$. To further reduce the error probability, one can repeat Step-I (a) multiple times and use the average value as $W(T)$.

Note that even though the results in Theorems 5 and 6 assume a large $V$, in practice, the $V$ value may not have to be very large (see Section VIII for an example).

VII. When there is a single queue 

In this section, we look at the backlog process behavior under QLA in the special case when there is only one queue in the network. In this case, we have only a single traffic constraint in the deterministic problem (10):

$$\sum_{s_i} p_{s_i}\, g_1(s_i, x^{(s_i)}) \le \sum_{s_i} p_{s_i}\, b_1(s_i, x^{(s_i)}),$$

where $x = (x^{(s_1)}, \ldots, x^{(s_M)})^T$. Thus $r = 1$ and the Lagrange multiplier is a scalar. This single queue setting is useful and can be used to model many network optimization problems, e.g., [3] and [5]. Below, we first provide deterministic upper and lower bounds for $U(t)$. These bounds hold for an arbitrary network state distribution and regardless of the way the state process evolves (possibly even non-ergodic). We then obtain a probabilistic bound of $U(t)$'s deviation from $U_V^*$ under general single queue network optimization problems. The probabilistic bound has the same form as those in Theorems 1 and 4, but does not require any additional conditions such as (15) and (42).



A. Deterministic Upper and Lower Bounds of $U(t)$

Here we provide upper and lower bounds of $U(t)$ under QLA. First define the following problem for each network state $s_i$, for $i \in \{1, \ldots, M\}$:

$$\max:\ q_{s_i}(U) \triangleq \inf_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \Big\{Vf(s_i, x^{(s_i)}) + U\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big]\Big\} \tag{57}$$
$$\text{s.t.}\quad U \ge 0.$$

It is easy to see that $q_{s_i}(U)$ is the dual of (10) when $s_i$ is the only network state. We now have the following theorem:

Theorem 8: Assume (57) has a unique optimal solution $U^*_{s_i} \in [0, \infty]$ for all $s_i$. Consider the interval:

$$\mathcal{I} = \Big[\min_{s_i} U^*_{s_i} - B,\ \max_{s_i} U^*_{s_i} + B\Big];$$

if under QLA, there exists $t_0 \ge 0$ such that $U(t_0) \in \mathcal{I}$, then $U(t) \in \mathcal{I}$ for all $t \ge t_0$.

Note that here $[0, \infty]$ includes the value $\infty$. To prove Theorem 8, we use the following lemma.

Lemma 4: If $U(t) \ne U_V^*$, then

(a) Under QLA,

$$E\big\{(U(t) - U_V^*)\big[g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)})\big] \mid U(t)\big\} < 0.$$

(b) Under OSM,

$$(U(t) - U_V^*)\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big] < 0.$$

Proof: See Appendix D. ■

Lemma 4 shows that under QLA, if $U(t) < U_V^*$, then $E\{g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)}) \mid U(t)\} > 0$; else if $U(t) > U_V^*$, we have $E\{g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)}) \mid U(t)\} < 0$. This shows that when $S(t)$ is i.i.d., the backlog value under QLA probabilistically moves in the direction towards $U_V^*$. When there is a single network state, in which case (a) and (b) are equivalent, we see that $U(t)$ deterministically moves in the direction towards $U_V^*$.

Proof: (Theorem 8) First we see that, though it is possible for some $U^*_{s_i}$ to be infinity, it can be easily shown that $\min_{s_i} U^*_{s_i} < \infty$. Thus $\mathcal{I}$ is well defined.

We now prove the lower bound. The upper bound can similarly be obtained. Without loss of generality, assume $U^*_{s_1} = \min_{s_i} U^*_{s_i}$ and $U^*_{s_M} = \max_{s_i} U^*_{s_i}$. Suppose at a time $t$ we have $U(t) \in \mathcal{I}$:

(1) If $U(t) \ge U^*_{s_1}$, we have $U(t+1) \ge U^*_{s_1} - B$, since $B$ is an upper bound of the magnitude change of $U(t)$.

(2) Now if $U^*_{s_1} > U(t) \ge U^*_{s_1} - B$, we see that $U(t) < U^*_{s_i}$ for all $i = 1, \ldots, M$. Also, when given $U(t)$ and $S(t) = s_i$, QLA's action is the same as OSM applied to (57). Thus by part (b) of Lemma 4, we see that $\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U) = A(t) - \mu(t) > 0$, hence by the queueing dynamic (4) we have $U(t+1) > U(t) \ge U^*_{s_1} - B$. ■

Note that we did not use any assumption on the network state process in the above proof, hence the result holds for an arbitrary network state distribution and regardless of the way $S(t)$ evolves.
B. Probabilistic bound of $U(t)$'s deviation from $U_V^*$

In this section we provide a probabilistic bound of $U(t)$'s deviation from $U_V^*$. The bound has a similar form as those in Theorems 1 and 4, but only applies to general single queue optimization problems. However, the bound here does not require additional conditions such as (15) and (42). Hence it is more general than the previous results when restricted to single queue optimization problems. Recall that $P(D, m)$ is defined in (23) as:

$$P(D, m) = \limsup_{t\to\infty}\frac{1}{t}\sum_{\tau=0}^{t-1} \Pr\{|U(\tau) - U_V^*| > D + m\}.$$

Theorem 9: Under QLA, there exist constants $d, a^*, \rho^* > 0$, possibly dependent on $V$, such that:

$$P(d, m) \le a^* e^{-\rho^* m}. \tag{58}$$

Theorem 9 shows that the probability of deviating further from $U_V^*$ will eventually decay exponentially. To prove Theorem 9, we need the following lemmas:

Lemma 5: $q(U_V^*) > -\infty$.

Lemma 6: Under QLA, if (a) $0 \le U_1 \le U_2 \le U_V^*$ or (b) $0 \le U_V^* \le U_1 \le U_2$, then:

$$E\big\{g_1(s_i, x_{U_1}^{(s_i)}) - b_1(s_i, x_{U_1}^{(s_i)}) \mid U(t) = U_1\big\} \ge E\big\{g_1(s_i, x_{U_2}^{(s_i)}) - b_1(s_i, x_{U_2}^{(s_i)}) \mid U(t) = U_2\big\}.$$

In case (a), both quantities are positive; while in case (b), both quantities are negative.

Lemma 7: Under QLA,

$$(U_V^* - U(t))\, E\big\{\big[g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)})\big] \mid U(t)\big\} \ge q(U_V^*) - q(U(t)). \tag{59}$$
Lemma 5 follows easily from the $\epsilon$-slackness assumption in Section III-B. Lemma 6 can be viewed as saying that when $U(t)$ deviates more from $U_V^*$, the chosen action generates a larger drift towards $U_V^*$. Lemma 7 can be viewed as the subgradient property under QLA. Lemmas 6 and 7 are proven in Appendix E. We now take the following approach to prove Theorem 9. We first use Lemmas 5 and 7 to find a single $U(t)$ value whose drift value is large enough for analysis, and then conclude by Lemma 6 that any other $U(t)$ that is further away from $U_V^*$ generates a larger drift. Then we carry out the same drift analysis as in the proof of Theorem 1 to obtain the probability bound.

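Proof: (Theorem 9) Since $r = 1$, we have the dual function being:

$$q(U) = \inf_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \sum_{s_i} p_{s_i}\Big\{Vf(s_i, x^{(s_i)}) + U\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big]\Big\}.$$

Now by the $\epsilon$-slackness assumption in Section III-B and the fact that the cost functions are bounded by $\delta_{max}$, it can easily be shown that:

$$q(U) \le V\delta_{max} - \epsilon U, \quad \forall\, U \ge 0.$$

Hence if $q(U) \ge q(U_V^*) - \epsilon_0$ for some $\epsilon_0 > 0$, then we have:

$$\epsilon_0 \ge q(U_V^*) - q(U) \ge q(U_V^*) + \epsilon U - V\delta_{max},$$

which by Lemma 5 implies:

$$U \le \frac{\epsilon_0 + V\delta_{max} - q(U_V^*)}{\epsilon} < \infty. \tag{60}$$

Now fix an $\epsilon_0 > 0$, and define the set $S_{\epsilon_0} \triangleq \{U \ge 0 \mid q(U) \ge q(U_V^*) - \epsilon_0\}$. Define:

$$d(V, \epsilon_0) = \sup_{U \in S_{\epsilon_0}} |U - U_V^*|. \tag{61}$$

By (60) we see that $d(V, \epsilon_0) \in (0, \infty)$. Also whenever $|U(t) - U_V^*| > d(V, \epsilon_0)$, we have:

$$q(U_V^*) - q(U(t)) > \epsilon_0. \tag{62}$$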

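Thus by Lemma 7 we see that when $|U(t) - U_V^*| > d(V, \epsilon_0)$,

$$(U_V^* - U(t))\, E\big\{\big(g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)})\big) \mid U(t)\big\} > \epsilon_0.$$

Now consider $U_V^* > d(V, \epsilon_0) + \epsilon_1$ for some small $\epsilon_1 > 0$. Define $U_l = U_V^* - d(V, \epsilon_0) - \epsilon_1$. From above and Lemma 7 we see that if $U(t) = U_l$, then:

$$E\big\{\big(g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)})\big) \mid U(t)\big\} > \frac{\epsilon_0}{d(V, \epsilon_0) + \epsilon_1}. \tag{63}$$

Denote $\eta_d = \frac{\epsilon_0}{d(V, \epsilon_0) + \epsilon_1}$. It is easy to see that $\eta_d \le B$. Using Lemma 6, we see that (63) holds for all $U(t) \le U_l = U_V^* - d(V, \epsilon_0) - \epsilon_1$. A similar argument will show that whenever $U(t) \ge U_u = U_V^* + d(V, \epsilon_0) + \epsilon_1$,

$$E\big\{\big(g_1(s_i, x_U^{(s_i)}) - b_1(s_i, x_U^{(s_i)})\big) \mid U(t)\big\} < -\eta_d. \tag{64}$$

Now let $d = d(V, \epsilon_0) + \epsilon_1$ and define:

$$Y(t) = \max\{|U(t) - U_V^*| - d,\ 0\}, \tag{65}$$

then whenever $Y(t) \ge B$, we have

$$E\{Y(t+1) - Y(t) \mid U(t)\} \le -\eta_d.$$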

Also $|Y(t+1) - Y(t)| \le B$ for all $t$. We can now carry out a similar argument as in the proof of Theorem 1 and obtain:

1 * 

lim sup - V ^e^™'Pr{y(T) > m} < e2"^,(66) 

r— 1 

where w = „, — j:;. Thus we have: 



2(^2 + B?/d/3)e« + ''c 



/3 Ijr 



V{d,m)<— '■ ^^^-^^^^ — e ^B^d/^ (67) 



Therefore (|58ll holds with: 



2Vd 



... 2{B^ + B7^d/3)e^^ , rjd 



52 + B-qd/i 



Now if $U_V^* - d(V, \epsilon_0) - \epsilon_1 < 0$, then we have $U_V^* - U(t) < d$ whenever $U(t) \le U_V^*$. Thus the event $\{Y(t) > m\}$ is simply the event that $U(t) > U_V^* + d + m$. It is easy to see from above that (67) also holds in this case. ■
To see how Theorem 9 is related to Theorems 1 and 4, first consider the case where (15) holds for all $U \ge 0$. In this case, for a fixed $\epsilon_0 = \Theta(1)$, we have for all $U \in S_{\epsilon_0}$ that:

$$\epsilon_0 \ge q(U_V^*) - q(U) \ge L|U - U_V^*|.$$

Thus $d(V, \epsilon_0) = \Theta(1)$, which then implies $\eta_d$, $\rho^*$ and $a^*$ are all $\Theta(1)$. Thus by (67) we see that $U(t)$ will mostly be within $O(\log(V))$ distance from $U_V^*$, as stated in Theorem 1. Now if (42) holds for all $U \ge 0$, then we see from (45) that:

$$\epsilon_0 \ge q(U_V^*) - q(U) \ge \frac{L}{V}|U - U_V^*|^2, \quad \forall\, U \in S_{\epsilon_0}.$$

This implies $d(V, \epsilon_0) = O(\sqrt{V})$ and $\eta_d = \Omega(1/\sqrt{V})$. Thus $\rho^* = \Omega(1/\sqrt{V})$ and $a^* = O(V)$, and again $U(t)$ is mostly within $O(\sqrt{V}\log(V))$ distance from $U_V^*$, as shown in Theorem 4.

VIII. Simulation 

In this section we provide simulation results for the FQLA algorithms. For simplicity, we only consider the case where $q_0(U)$ is locally polyhedral. We consider a five queue system that extends the example in Section III-D. In this case $r = 5$. The system is shown in Fig. 4. The goal is to perform power allocation at each node so as to support the arrival with minimum energy expenditure.

Fig. 4. A five queue system

In this example, the random network state $S(t)$ is the vector containing the random arrivals $R(t)$ and the channel states $S_i(t)$, $i = 1, \ldots, 5$. As in Section III-D, we have:

$$A(t) = (R(t), \mu_1(t), \mu_2(t), \mu_3(t), \mu_4(t))^T,$$
$$\mu(t) = (\mu_1(t), \mu_2(t), \mu_3(t), \mu_4(t), \mu_5(t))^T,$$

i.e., $A_1(t) = R(t)$, $A_i(t) = \mu_{i-1}(t)$ for $i \ge 2$, where $\mu_i(t)$
is the service rate obtained by queue $i$ at time $t$. $R(t)$ is 0 or 2 with probabilities $\frac{3}{8}$ and $\frac{5}{8}$, respectively. $S_i(t)$ can be "Good" or "Bad" with equal probabilities for $1 \le i \le 5$. When the channel is good, one unit of power can serve two packets; otherwise one unit of power can serve only one packet. We assume all channels can be activated at the same time without affecting others. It can be verified that $U_V^* = (5V, 4V, 3V, 2V, V)^T$ is unique. In this example, the backlog vector process evolves as a Markov chain with countably many states. Thus one can show that there exists a stationary distribution for the backlog vector under QLA.

We simulate FQLA-Ideal and FQLA-General with V = 
50, 100, 200, 500, 1000 and 2000. We run each case for 5 x 
10^ slots under both algorithms. For FQLA-General, we use 
T = 50V in Step-I and repeat Step-I 100 times and use 
their average as $W(T)$. It is easy to see from the left plot in Fig. 5 that the average queue sizes under both FQLAs are always close to the value $5\log^2(V)$ ($r = 5$). From the
middle plot we also see that the percentage of packets dropped 



decreases rapidly and gets below 10^^ when V > 500 under 
both FQLAs. These plots show that in practice, $V$ may not have to be very large for Theorems 5 and 6 to hold. The right plot shows a sample $(W_1(t), W_2(t))$ process for a $10^5$-slot interval under FQLA-Ideal with $V = 1000$, considering only the first two queues of Fig. 4 for this example. We see that during this interval, $(W_1(t), W_2(t))$ always remains close to $(U^*_{V1}, U^*_{V2}) = (5V, 4V)$, and $W_1(t) \ge \mathcal{W}_1 = 4952$, $W_2(t) \ge \mathcal{W}_2 = 3952$. For all $V$ values, the average power expenditure is very close to 3.75, which is the optimal energy expenditure, and the average of $\sum_j W_j(t)$ is very close to $15V$ (plots omitted for brevity).




Fig. 5. FQLA-Ideal performance: Left - Average queue size; Middle - Percentage of packets dropped; Right - Sample $(W_1(t), W_2(t))$ process for $t \in [10000, 110000]$ and $V = 1000$ under FQLA-Ideal.

IX. Lagrange Multiplier: "shadow price" and 
"network gravity" 

It is well known that Lagrange Multipliers can play the 
role of "shadow prices" to regulate flows in many flow-based 
problems with different objectives, e.g., [16]. This important 
feature has enabled the development of many distributed al- 
gorithms in resource allocation problems, e.g., [17]. However, 
a problem of this type typically requires data transmissions to 
be represented as flows. Thus in a network that is discrete in 
nature, e.g., time slotted or packetized transmission, a rate 
allocation solution obtained by solving such a flow-based 
problem does not immediately specify a scheduling policy. 

Recently, several Lyapunov algorithms have been proposed 
to solve utility optimization problems under discrete network 
settings. In these algorithms, backlog vectors act as the "grav- 
ity" of the network and allow optimal scheduling to be built 
upon them. It is also revealed in [14] that QLA is closely 
related to the dual subgradient method and backlogs play the 
same role as Lagrange multipliers in a time invariant network. 
Now we see by Theorems 1 and 4 that the backlogs indeed play the same role as Lagrange multipliers even under a more general stochastic network.

In fact, the backlog process under QLA can be closely 
related to a sequence of updated Lagrange multipliers under 
a subgradient method. Consider the following important vari- 
ant of OSM, called the randomized incremental subgradient 
method (RISM) [12], which makes use of the separable nature of the dual function and solves the dual problem (11) as follows:

RISM: Initialize $U(0)$; at iteration $t$, observe $U(t)$, choose a random state $S(t) \in \mathcal{S}$ according to some probability law.

(1) If $S(t) = s_i$, find $x_U^{(s_i)} \in \mathcal{X}^{(s_i)}$ that solves the following:

$$\min\ Vf(s_i, x) + \sum_j U_j(t)\big[g_j(s_i, x) - b_j(s_i, x)\big]$$
$$\text{s.t.}\quad x \in \mathcal{X}^{(s_i)}. \tag{69}$$

(2) Using the $x_U^{(s_i)}$ found, update $U(t)$ according to (see the footnote):

$$U_j(t+1) = \max\big[U_j(t) - \alpha^t b_j(s_i, x_U^{(s_i)}),\ 0\big] + \alpha^t g_j(s_i, x_U^{(s_i)}).$$



As an example, $S(t)$ can be chosen by independently
choosing $S(t) = s_i$ with probability $p_{s_i}$ every time slot. In this
case $S(t)$ will be i.i.d. Note that in the stochastic problem, a
network state $s_i$ is chosen randomly by nature as the physical
system state at time $t$, while here a state is chosen artificially
by RISM according to some probability law. Comparing the QLA
action rule with (69), we see that given the same $U(t)$ and $s_i$,
QLA and RISM choose an action in the same way. If also $\alpha^t = 1$
for all $t$, and $S(t)$ under RISM evolves according to the same
probability law as $S(t)$ of the physical system, then
applying QLA to the network is indeed equivalent to applying
RISM to the dual problem of (10), with the network state
being chosen by nature and the network backlog being the
Lagrange multiplier. Therefore, Lagrange Multipliers under
such stochastic discrete network settings act as the "network
gravity," and thus allow scheduling to be done optimally and
adaptively based on them. This "network gravity" functionality
of Lagrange Multipliers in discrete network problems can
thus be viewed as the counterpart of their "shadow price"
functionality in the flow-based problems. Furthermore, the
"network gravity" property of Lagrange Multipliers enables
the use of place holder bits to reduce network delay in network
utility optimization problems. This is a unique feature not
possessed by its "price" counterpart.
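
The equivalence between QLA and RISM described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a single queue, two network states, a binary action (idle or transmit), and the numeric values of `PROBS`, `ARRIVAL`, `RATE`, `POWER` and `V`. With a step size of 1 and the same state sequence, the backlog path and the multiplier path should coincide.

```python
import random

# Hypothetical toy problem (not from the paper): one queue, two network
# states, and a binary action x in {0, 1} (idle or transmit).
STATES = [0, 1]
PROBS = [0.5, 0.5]      # assumed state distribution p_{s_i}
ARRIVAL = [1.0, 2.0]    # g(s, x): arrivals, independent of the action here
RATE = [2.0, 4.0]       # b(s, x) = RATE[s] * x: service when transmitting
POWER = [1.0, 3.0]      # f(s, x) = POWER[s] * x: cost of transmitting
V = 10.0


def choose_action(U, s):
    """Minimize V*f(s,x) + U*(g(s,x) - b(s,x)) over x in {0, 1}, as in (69)."""
    def objective(x):
        return V * POWER[s] * x + U * (ARRIVAL[s] - RATE[s] * x)
    return min((0, 1), key=objective)


def rism_step(U, s, alpha=1.0):
    """One RISM iteration using the update rule of step (2)."""
    x = choose_action(U, s)
    return max(U - alpha * RATE[s] * x, 0.0) + alpha * ARRIVAL[s]


def qla_step(U, s):
    """One slot of the queue under QLA: serve first, then add arrivals."""
    x = choose_action(U, s)
    mu, A = RATE[s] * x, ARRIVAL[s]
    return max(U - mu, 0.0) + A


def run(step, seed, slots=1000):
    """Drive `step` with an i.i.d. state sequence generated from `seed`."""
    rng = random.Random(seed)
    U, path = 0.0, []
    for _ in range(slots):
        s = rng.choices(STATES, weights=PROBS)[0]
        U = step(U, s)
        path.append(U)
    return path


backlog = run(qla_step, seed=1)       # states "chosen by nature"
multiplier = run(rism_step, seed=1)   # same law, chosen by RISM
print(backlog == multiplier)
```

The two functions differ only in how they are described (queue dynamics versus multiplier update), which is the point of the comparison: once the step size is 1 and the state laws agree, there is nothing left to distinguish them.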

Appendix A - Proof of Lemma 2

Here we prove Lemma 2. First we prove the following
useful lemma.

Lemma 8: Under queueing dynamic (4), we have:

$$\|U(t+1) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(U_V^* - U(t)\big)^T\big(A(t) - \mu(t)\big).$$
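Proof: (Lemma 8) From (4), we see that $U(t+1)$
is obtained by first projecting $U(t) - \mu(t)$ onto $\mathbb{R}_+^r$ and
then adding $A(t)$. Thus we have (we use $[x]^+$ to denote the
projection of $x$ onto $\mathbb{R}_+^r$):

$$\begin{aligned}
\|U(t+1) - U_V^*\|^2 &= \big\|[U(t) - \mu(t)]^+ + A(t) - U_V^*\big\|^2 \\
&= \big([U(t) - \mu(t)]^+ + A(t) - U_V^*\big)^T\big([U(t) - \mu(t)]^+ + A(t) - U_V^*\big) \\
&= \big([U(t) - \mu(t)]^+ - U_V^*\big)^T\big([U(t) - \mu(t)]^+ - U_V^*\big) \\
&\quad + 2\big([U(t) - \mu(t)]^+ - U_V^*\big)^T A(t) + \|A(t)\|^2.
\end{aligned} \tag{70}$$

Footnote 1: Note that the update rule in step (2) of RISM is different from RISM's usual rule, i.e.,
$U_j(t+1) = \max\big[U_j(t) - \alpha^t b_j(s_i, x) + \alpha^t g_j(s_i, x),\, 0\big]$, but it almost does not
affect the performance of RISM.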



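Now by the non-expansive property of projection [12], we
have:

$$\begin{aligned}
\big([U(t) - \mu(t)]^+ - U_V^*\big)^T\big([U(t) - \mu(t)]^+ - U_V^*\big)
&\le \big(U(t) - \mu(t) - U_V^*\big)^T\big(U(t) - \mu(t) - U_V^*\big) \\
&= \|U(t) - U_V^*\|^2 + \|\mu(t)\|^2 - 2\big(U(t) - U_V^*\big)^T\mu(t).
\end{aligned}$$

Plug this into (70), we have:

$$\begin{aligned}
\|U(t+1) - U_V^*\|^2 &\le \|U(t) - U_V^*\|^2 + \|\mu(t)\|^2 - 2\big(U(t) - U_V^*\big)^T\mu(t) \\
&\quad + \|A(t)\|^2 + 2\big([U(t) - \mu(t)]^+ - U_V^*\big)^T A(t).
\end{aligned} \tag{71}$$

Now since $U(t), \mu(t), A(t) \succeq 0$, it is easy to see that:

$$\big([U(t) - \mu(t)]^+\big)^T A(t) \le U(t)^T A(t). \tag{72}$$

By (71) and (72) we have:

$$\begin{aligned}
\|U(t+1) - U_V^*\|^2 &\le \|U(t) - U_V^*\|^2 + \|\mu(t)\|^2 - 2\big(U(t) - U_V^*\big)^T\mu(t) \\
&\quad + \|A(t)\|^2 + 2\big(U(t) - U_V^*\big)^T A(t) \\
&\le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(U_V^* - U(t)\big)^T\big(A(t) - \mu(t)\big),
\end{aligned}$$

where the last inequality follows since $\|A(t)\|^2 \le B^2$ and
$\|\mu(t)\|^2 \le B^2$. ■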
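
As a sanity check on Lemma 8, the inequality can be tested on randomly drawn vectors. The sketch below is only illustrative: the dimension, the bound `B`, the sampling ranges and the helper names are assumptions, and `U_star` stands for an arbitrary nonnegative vector in the role of $U_V^*$ (the proof uses only its nonnegativity). A small tolerance absorbs floating-point rounding.

```python
import math
import random


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def random_nonneg_vector(rng, dim, max_norm):
    """Draw a componentwise nonnegative vector with Euclidean norm <= max_norm."""
    v = [rng.random() for _ in range(dim)]
    scale = rng.random() * max_norm / (norm(v) or 1.0)
    return [scale * c for c in v]


def lemma8_gap(U, U_star, A, mu, B):
    """Right-hand side minus left-hand side of Lemma 8 (expected to be >= 0)."""
    # Queueing dynamic: project U - mu onto the nonnegative orthant, add A.
    U_next = [max(u - m, 0.0) + a for u, m, a in zip(U, mu, A)]
    lhs = sum((un - us) ** 2 for un, us in zip(U_next, U_star))
    dist = sum((u - us) ** 2 for u, us in zip(U, U_star))
    inner = sum((us - u) * (a - m) for us, u, a, m in zip(U_star, U, A, mu))
    return dist + 2 * B ** 2 - 2 * inner - lhs


rng = random.Random(0)
B, dim = 5.0, 4
gaps = []
for _ in range(10000):
    U = [20 * rng.random() for _ in range(dim)]
    U_star = [20 * rng.random() for _ in range(dim)]
    A = random_nonneg_vector(rng, dim, B)     # ||A|| <= B
    mu = random_nonneg_vector(rng, dim, B)    # ||mu|| <= B
    gaps.append(lemma8_gap(U, U_star, A, mu, B))
print(min(gaps) >= -1e-9)
```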

We now prove Lemma 2.

Proof: (Lemma 2) By Lemma 8 we see that when $S(t) = s_i$,
we have the following for any network state $s_i$ with a given
$U(t)$ (here we add superscripts to $U(t+1)$, $A(t)$ and $\mu(t)$
to indicate their dependence on $s_i$):

$$\|U^{(s_i)}(t+1) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(U_V^* - U(t)\big)^T\big(A^{(s_i)}(t) - \mu^{(s_i)}(t)\big). \tag{73}$$

By definition, $A_j^{(s_i)}(t) = g_j(s_i, x^{(s_i)})$ and $\mu_j^{(s_i)}(t) = b_j(s_i, x^{(s_i)})$,
with $x^{(s_i)}$ being the action chosen by QLA in state $s_i$ for the given
$U(t)$. Now consider the deterministic problem (10) with only
a single network state $s_i$; then the corresponding dual function
(12) becomes:

$$q_{s_i}(U(t)) = \inf_{x^{(s_i)} \in \mathcal{X}^{(s_i)}} \Big\{ V f(s_i, x^{(s_i)}) + \sum_j U_j(t)\big[g_j(s_i, x^{(s_i)}) - b_j(s_i, x^{(s_i)})\big] \Big\}. \tag{74}$$

Therefore we see that $A^{(s_i)}(t) - \mu^{(s_i)}(t)$ is a
subgradient of $q_{s_i}(U)$ at $U(t)$. Thus by (15) we have:

$$\big(U_V^* - U(t)\big)^T\big(A^{(s_i)}(t) - \mu^{(s_i)}(t)\big) \ge q_{s_i}(U_V^*) - q_{s_i}(U(t)). \tag{75}$$

Plug (75) into (73), we get:

$$\|U^{(s_i)}(t+1) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(q_{s_i}(U_V^*) - q_{s_i}(U(t))\big). \tag{76}$$

More generally, we have:

$$\|U(t+1) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2B^2 - 2\big(q_{S(t)}(U_V^*) - q_{S(t)}(U(t))\big). \tag{77}$$

Now fix $\nu > 0$. Summing up (77) from time $t$ to $t + T_\nu - 1$,
we obtain:

$$\|U(t+T_\nu) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + 2T_\nu B^2 - 2\sum_{\tau=0}^{T_\nu-1}\big[q_{S(t+\tau)}(U_V^*) - q_{S(t+\tau)}(U(t+\tau))\big]. \tag{78}$$

Adding and subtracting the term $2\sum_{\tau=0}^{T_\nu-1} q_{S(t+\tau)}(U(t))$ from
the right hand side, we obtain:

$$\begin{aligned}
\|U(t+T_\nu) - U_V^*\|^2 &\le \|U(t) - U_V^*\|^2 + 2T_\nu B^2
 - 2\sum_{\tau=0}^{T_\nu-1}\big[q_{S(t+\tau)}(U_V^*) - q_{S(t+\tau)}(U(t))\big] \\
&\quad + 2\sum_{\tau=0}^{T_\nu-1}\big[q_{S(t+\tau)}(U(t+\tau)) - q_{S(t+\tau)}(U(t))\big].
\end{aligned} \tag{79}$$

Since $\|U(t) - U(t+\tau)\| \le \tau B$ and every subgradient satisfies
$\|A^{(s_i)}(t) - \mu^{(s_i)}(t)\| \le B$, using (15) and the fact that for any two vectors $x$ and $y$,
$x^T y \le \|x\|\|y\|$, we have:

$$q_{S(t+\tau)}(U(t+\tau)) - q_{S(t+\tau)}(U(t)) \le \tau B^2. \tag{80}$$

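Hence:

$$\sum_{\tau=0}^{T_\nu-1}\big[q_{S(t+\tau)}(U(t+\tau)) - q_{S(t+\tau)}(U(t))\big] \le \sum_{\tau=0}^{T_\nu-1}\tau B^2 = \frac{1}{2}\big(T_\nu^2 B^2 - T_\nu B^2\big).$$

Plug this into (79), we have:

$$\|U(t+T_\nu) - U_V^*\|^2 \le \|U(t) - U_V^*\|^2 + (T_\nu^2 + T_\nu)B^2 - 2\sum_{\tau=0}^{T_\nu-1}\big[q_{S(t+\tau)}(U_V^*) - q_{S(t+\tau)}(U(t))\big]. \tag{81}$$

Now denote $Z(t) = (\mathcal{H}(t), U(t))$, i.e., the pair of the history
up to time $t$, $\mathcal{H}(t) = \{S(\tau)\}_{\tau=0}^{t-1}$, and the current backlog.
Taking expectations on both sides of (81), conditioning on
$Z(t)$, we have:

$$\begin{aligned}
E\{\|U(t+T_\nu) - U_V^*\|^2 \mid Z(t)\} &\le E\{\|U(t) - U_V^*\|^2 \mid Z(t)\} + (T_\nu^2 + T_\nu)B^2 \\
&\quad - 2E\Big\{\sum_{\tau=0}^{T_\nu-1}\big[q_{S(t+\tau)}(U_V^*) - q_{S(t+\tau)}(U(t))\big] \,\Big|\, Z(t)\Big\}.
\end{aligned}$$

Since the number of times $q_{s_i}(U)$ appears in the interval
$[t, t+T_\nu-1]$ is $\|\mathcal{T}_{s_i}(t,T_\nu)\|$, we can rewrite the above as:

$$\begin{aligned}
E\{\|U(t+T_\nu) - U_V^*\|^2 \mid Z(t)\} &\le E\{\|U(t) - U_V^*\|^2 \mid Z(t)\} + (T_\nu^2 + T_\nu)B^2 \\
&\quad - 2T_\nu E\Big\{\sum_{i=1}^{M}\frac{\|\mathcal{T}_{s_i}(t,T_\nu)\|}{T_\nu}\big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big] \,\Big|\, Z(t)\Big\}.
\end{aligned}$$

Adding and subtracting $2T_\nu\sum_{i=1}^{M} p_{s_i}\big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big]$
from the right hand side, we have:

$$\begin{aligned}
E\{\|U(t+T_\nu) - U_V^*\|^2 \mid Z(t)\} &\le E\{\|U(t) - U_V^*\|^2 \mid Z(t)\} + (T_\nu^2 + T_\nu)B^2 \\
&\quad - 2T_\nu\sum_{i=1}^{M} p_{s_i}\big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big] \\
&\quad - 2T_\nu E\Big\{\sum_{i=1}^{M}\Big[\frac{\|\mathcal{T}_{s_i}(t,T_\nu)\|}{T_\nu} - p_{s_i}\Big]\big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big] \,\Big|\, Z(t)\Big\}.
\end{aligned} \tag{82}$$

Denote the term inside the last expectation of (82) as $Q$, i.e.,

$$Q = \sum_{i=1}^{M}\Big[\frac{\|\mathcal{T}_{s_i}(t,T_\nu)\|}{T_\nu} - p_{s_i}\Big]\big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big]. \tag{83}$$

Using the fact that $q_{s_i}(U_V^*) - q_{s_i}(U(t))$ is a constant given
$Z(t)$, we have:

$$\begin{aligned}
E\{Q \mid Z(t)\} &= \sum_{i=1}^{M} E\Big\{\frac{\|\mathcal{T}_{s_i}(t,T_\nu)\|}{T_\nu} - p_{s_i} \,\Big|\, Z(t)\Big\} \times \big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big] \\
&= \sum_{i=1}^{M}\Big[\frac{E\{\|\mathcal{T}_{s_i}(t,T_\nu)\| \mid Z(t)\}}{T_\nu} - p_{s_i}\Big] \times \big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big].
\end{aligned}$$

By (15), $|q_{s_i}(U_V^*) - q_{s_i}(U(t))| \le B\|U_V^* - U(t)\|$, thus we
have:

$$\begin{aligned}
\big|E\{Q \mid Z(t)\}\big| &\le B\|U_V^* - U(t)\| \times \sum_{i=1}^{M}\Big|\frac{E\{\|\mathcal{T}_{s_i}(t,T_\nu)\| \mid Z(t)\}}{T_\nu} - p_{s_i}\Big| \\
&\le \nu B\|U_V^* - U(t)\|,
\end{aligned} \tag{84}$$

where the last step follows from the definition of $T_\nu$. Now by
the separable form (13) of the dual function and the definition of $q_{s_i}$ in (74):

$$\sum_{i=1}^{M} p_{s_i}\big[q_{s_i}(U_V^*) - q_{s_i}(U(t))\big] = q(U_V^*) - q(U(t)).$$

Plug this and (84) into (82), we have:

$$\begin{aligned}
E\{\|U(t+T_\nu) - U_V^*\|^2 \mid Z(t)\} &\le E\{\|U(t) - U_V^*\|^2 \mid Z(t)\} + (T_\nu^2 + T_\nu)B^2 \\
&\quad - 2T_\nu\big(q(U_V^*) - q(U(t))\big) + 2T_\nu\nu B\|U_V^* - U(t)\|.
\end{aligned}$$

Recall that $Z(t) = (\mathcal{H}(t), U(t))$. Taking expectation over
$\mathcal{H}(t)$ on both sides proves the lemma. ■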

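Appendix B - Proof of (31)

Here we prove that for $Y(t)$ defined in the proof of part (b)
of Theorem 1, we have:

$$\Delta_{T_\nu}(Y(t)) \le e^{2wT_\nu B} - \frac{w\eta}{2}e^{wY(t)} \quad \text{for all } Y(t) \ge 0.$$

Proof: If $Y(t) > T_\nu B$, denote $\delta(t) = Y(t+T_\nu) - Y(t)$.
It is easy to see that $|\delta(t)| \le T_\nu B$. Rewrite (30) as:

$$\Delta_{T_\nu}(Y(t)) = e^{wY(t)}E\{(e^{w\delta(t)} - 1) \mid U(t)\}. \tag{85}$$

By a Taylor expansion, we have that:

$$e^{w\delta(t)} = 1 + w\delta(t) + \frac{w^2\delta^2(t)}{2}g(w\delta(t)), \tag{86}$$

where $g(y) = 2\sum_{k=2}^{\infty}\frac{y^{k-2}}{k!} = \frac{2(e^y - 1 - y)}{y^2}$ [18] has the
following properties:

1) $g(0) = 1$; $g(y) \le 1$ for $y < 0$; $g(y)$ is monotone
increasing for $y \ge 0$;

2) For $y < 3$,

$$g(y) = 2\sum_{k=2}^{\infty}\frac{y^{k-2}}{k!} \le \sum_{k=2}^{\infty}\frac{y^{k-2}}{3^{k-2}} = \frac{1}{1 - y/3}.$$

Thus by (86) we have:

$$e^{w\delta(t)} \le 1 + w\delta(t) + \frac{w^2T_\nu^2B^2}{2}g(wT_\nu B). \tag{87}$$

Plug this into (85), and note that $Y(t) > T_\nu B$, so by (29) we
have $E\{\delta(t) \mid U(t)\} \le -\eta$. Hence:

$$\Delta_{T_\nu}(Y(t)) \le e^{wY(t)}\Big(-w\eta + \frac{w^2T_\nu^2B^2}{2}g(wT_\nu B)\Big). \tag{88}$$

Choosing $w = \frac{\eta}{T_\nu^2B^2 + T_\nu B\eta/3}$, we see that $wT_\nu B < 3$, thus:

$$\frac{w^2T_\nu^2B^2}{2}g(wT_\nu B) \le \frac{w^2T_\nu^2B^2}{2}\cdot\frac{1}{1 - wT_\nu B/3} = \frac{w\eta}{2},$$

where the last equality follows since:

$$w\big(T_\nu^2B^2 + T_\nu B\eta/3\big) = \eta \;\Rightarrow\; wT_\nu^2B^2 = \eta - wT_\nu B\eta/3 \;\Rightarrow\; wT_\nu^2B^2\cdot\frac{1}{1 - wT_\nu B/3} = \eta.$$

Therefore (88) becomes:

$$\Delta_{T_\nu}(Y(t)) \le -\frac{w\eta}{2}e^{wY(t)} \le e^{2wT_\nu B} - \frac{w\eta}{2}e^{wY(t)}. \tag{89}$$

Now if $Y(t) \le T_\nu B$, it is easy to see that
$\Delta_{T_\nu}(Y(t)) \le e^{2wT_\nu B} - e^{wY(t)} \le e^{2wT_\nu B} - \frac{w\eta}{2}e^{wY(t)}$, since
$Y(t+T_\nu) \le T_\nu B + Y(t) \le 2T_\nu B$ and $\frac{w\eta}{2} \le 1$, as $\eta \le T_\nu B$. Therefore for
all $Y(t) \ge 0$, we see that (31) holds. ■

Appendix C - Proof of Lemma 3

Here we prove Lemma 3. To save space, we will sometimes
use $[x]^+$ to denote $\max[x, 0]$.

Proof: It suffices to show
that (50) holds for a single queue $j$. Also, when $\mathcal{W}_j = 0$, (50)
trivially holds, thus we only consider $\mathcal{W}_j > 0$.

Part (A): We first prove $U_j(t) \le \max[W_j(t) - \mathcal{W}_j, 0] + \delta_{max}$.
First we see that it holds at $t = 0$, since $W_j(0) = \mathcal{W}_j$
and $U_j(0) = 0$. It also holds for $t = 1$. Since $U_j(0) = 0$ and
$W_j(0) = \mathcal{W}_j$, we have $U_j(1) = \tilde{A}_j(0) \le \delta_{max}$. Thus we have
$U_j(1) \le \max[W_j(1) - \mathcal{W}_j, 0] + \delta_{max}$.

Now assume $U_j(t) \le \max[W_j(t) - \mathcal{W}_j, 0] + \delta_{max}$ holds
for $t = 0, 1, 2, \ldots, k$; we want to show that it also holds for
$t = k + 1$. We first note that if $U_j(k) < \mu_j(k)$, the result
holds since then $U_j(k+1) = [U_j(k) - \mu_j(k)]^+ + \tilde{A}_j(k) = \tilde{A}_j(k) \le \delta_{max}$.
Thus we will consider $U_j(k) \ge \mu_j(k)$ in the following:

(A-I) Suppose $W_j(k) \ge \mathcal{W}_j$. Note that in this case we have:

$$U_j(k) \le W_j(k) - \mathcal{W}_j + \delta_{max}. \tag{90}$$

Also, $U_j(t+1) = \max[U_j(t) - \mu_j(t), 0] + \tilde{A}_j(t)$, with $\tilde{A}_j(k) = A_j(k)$ in this case. Since
$U_j(k) \ge \mu_j(k)$, we have:

$$\begin{aligned}
U_j(k+1) &= U_j(k) - \mu_j(k) + A_j(k) \\
&\le W_j(k) - \mathcal{W}_j + \delta_{max} - \mu_j(k) + A_j(k) \\
&\le \big[W_j(k) - \mu_j(k) + A_j(k) - \mathcal{W}_j\big]^+ + \delta_{max} \\
&\le \big[[W_j(k) - \mu_j(k)]^+ + A_j(k) - \mathcal{W}_j\big]^+ + \delta_{max} \\
&= \max[W_j(k+1) - \mathcal{W}_j, 0] + \delta_{max},
\end{aligned}$$

where the first inequality is due to (90), the second and third
inequalities are due to the $[\cdot]^+$ operator, and the last equality
follows from the definition of $W_j(k+1)$.

(A-II) Now suppose $W_j(k) < \mathcal{W}_j$. In this case we have
$U_j(k) \le \delta_{max}$, $\tilde{A}_j(k) = [A_j(k) - \mathcal{W}_j + W_j(k)]^+$ and:

$$U_j(k+1) = [U_j(k) - \mu_j(k)]^+ + \tilde{A}_j(k).$$

First consider the case when $W_j(k) \le \mathcal{W}_j - A_j(k)$. In this
case $\tilde{A}_j(k) = 0$, so we have:

$$U_j(k+1) = U_j(k) - \mu_j(k) \le \delta_{max} - \mu_j(k) \le \delta_{max},$$

which implies $U_j(k+1) \le \max[W_j(k+1) - \mathcal{W}_j, 0] + \delta_{max}$.
Else if $\mathcal{W}_j - A_j(k) < W_j(k) < \mathcal{W}_j$, we have:

$$\begin{aligned}
U_j(k+1) &= U_j(k) - \mu_j(k) + A_j(k) - \mathcal{W}_j + W_j(k) \\
&\le W_j(k) - \mathcal{W}_j + \delta_{max} - \mu_j(k) + A_j(k) \\
&\le \max[W_j(k+1) - \mathcal{W}_j, 0] + \delta_{max},
\end{aligned}$$

where the first inequality uses $U_j(k) \le \delta_{max}$ and the second
inequality follows as in (A-I).

Part (B): We now show that $U_j(t) \ge \max[W_j(t) - \mathcal{W}_j, 0]$.
First we see that it holds for $t = 0$ since $W_j(0) = \mathcal{W}_j$. We
also have for $t = 1$ that:

$$\begin{aligned}
[W_j(1) - \mathcal{W}_j]^+ &= \big[[W_j(0) - \mu_j(0)]^+ + A_j(0) - \mathcal{W}_j\big]^+ \\
&\le \big[[W_j(0) - \mu_j(0) - \mathcal{W}_j]^+ + A_j(0)\big]^+ \\
&= A_j(0).
\end{aligned}$$

Thus $U_j(1) \ge \max[W_j(1) - \mathcal{W}_j, 0]$ since $U_j(1) = A_j(0)$.
Now suppose $U_j(t) \ge \max[W_j(t) - \mathcal{W}_j, 0]$ holds for $t = 0, 1, \ldots, k$;
we will show that it holds for $t = k + 1$. We note
that if $W_j(k+1) < \mathcal{W}_j$, then $\max[W_j(k+1) - \mathcal{W}_j, 0] = 0$
and we are done. So we consider $W_j(k+1) \ge \mathcal{W}_j$.

(B-I) First if $W_j(k) \ge \mathcal{W}_j$, we have $\tilde{A}_j(k) = A_j(k)$. Hence:

$$\begin{aligned}
[W_j(k+1) - \mathcal{W}_j]^+ &= [W_j(k) - \mu_j(k)]^+ + A_j(k) - \mathcal{W}_j \\
&\le [W_j(k) - \mu_j(k) - \mathcal{W}_j]^+ + A_j(k) \\
&\le \big[[W_j(k) - \mathcal{W}_j]^+ - \mu_j(k)\big]^+ + A_j(k) \\
&\le [U_j(k) - \mu_j(k)]^+ + A_j(k),
\end{aligned}$$

where the first two inequalities are due to the $[\cdot]^+$ operator
and the last inequality is due to $U_j(k) \ge [W_j(k) - \mathcal{W}_j]^+$. This
implies $[W_j(k+1) - \mathcal{W}_j]^+ \le U_j(k+1)$.

(B-II) Suppose $W_j(k) < \mathcal{W}_j$. Since $W_j(k+1) \ge \mathcal{W}_j$, we
have $\mathcal{W}_j - A_j(k) \le W_j(k) < \mathcal{W}_j$, for otherwise $W_j(k) < \mathcal{W}_j - A_j(k)$
and $W_j(k+1) = [W_j(k) - \mu_j(k)]^+ + A_j(k) < \mathcal{W}_j$.
Hence in this case $\tilde{A}_j(k) = A_j(k) - \mathcal{W}_j + W_j(k) \ge 0$, and:

$$\begin{aligned}
[W_j(k+1) - \mathcal{W}_j]^+ &= [W_j(k) - \mu_j(k)]^+ + A_j(k) - \mathcal{W}_j \\
&\le [W_j(k) + U_j(k) - \mu_j(k)]^+ + A_j(k) - \mathcal{W}_j \\
&\le [U_j(k) - \mu_j(k)]^+ + A_j(k) - \mathcal{W}_j + W_j(k),
\end{aligned}$$

where the two inequalities are due to the fact that $U_j(k) \ge 0$
and $W_j(k) \ge 0$. The last line equals $[U_j(k) - \mu_j(k)]^+ + \tilde{A}_j(k) = U_j(k+1)$. ■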
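
The sandwich bound of Lemma 3 is a sample-path statement, so it can be checked by simulating the two queues side by side. The sketch below is a hedged illustration only: the admitted-arrival rule is the one read off cases (A-II) and (B-I) of the proof, while the function names, the integer arrival and service distributions, and the numeric parameters are assumptions. Integer quantities are used so that the comparison is exact.

```python
import random


def simulate(place_holder, delta_max, mu_max, slots, seed):
    """Run W(t) and U(t) side by side and record both sample paths."""
    rng = random.Random(seed)
    W, U = place_holder, 0          # W(0) is the place-holder level, U(0) = 0
    path = [(W, U)]
    for _ in range(slots):
        A = rng.randint(0, delta_max)   # arrivals, bounded by delta_max
        mu = rng.randint(0, mu_max)     # service, chosen arbitrarily here
        # Admitted arrivals: everything when W(t) is at or above the
        # place-holder level, otherwise only the part that overflows it.
        admitted = A if W >= place_holder else max(A - place_holder + W, 0)
        U = max(U - mu, 0) + admitted
        W = max(W - mu, 0) + A
        path.append((W, U))
    return path


def sandwich_holds(path, place_holder, delta_max):
    """Check max[W - place_holder, 0] <= U <= max[W - place_holder, 0] + delta_max."""
    for W, U in path:
        lower = max(W - place_holder, 0)
        if not lower <= U <= lower + delta_max:
            return False
    return True


results = [
    sandwich_holds(simulate(20, 3, 3, 5000, seed), 20, 3)
    for seed in range(20)
]
print(all(results))
```

Because the proof never uses how the service amounts are chosen, the random service sequence here is a stand-in for whatever the control algorithm would produce.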



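Appendix D - Proof of Lemma 4

Here we prove Lemma 4. Recall that we use $x_U$ to denote
the vector $(x_U^{(s_1)}, x_U^{(s_2)}, \ldots, x_U^{(s_M)})^T$ chosen by OSM for a
given $U(t)$, i.e., $x_U$ achieves the infimum of (12) at $U(t)$.

Proof: Now from the definition of $q(U(t))$, we have:

$$\begin{aligned}
q(U(t)) &= \mathcal{F}(x_U) + U(t)\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big] \\
&= \mathcal{F}(x_U) + U_V^*\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big] + \big(U(t) - U_V^*\big)\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big].
\end{aligned} \tag{91}$$

Using the fact that $q(U(t)) \le q(U_V^*)$ for $U(t) \ne U_V^*$, we have:

$$q(U_V^*) \ge \mathcal{F}(x_U) + U_V^*\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big] + \big(U(t) - U_V^*\big)\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big]. \tag{92}$$

This then implies:

$$\big(U(t) - U_V^*\big)\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big] \le q(U_V^*) - \mathcal{F}(x_U) - U_V^*\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big]. \tag{93}$$

However, since:

$$q(U_V^*) = \inf_x\big\{\mathcal{F}(x) + U_V^*\big[\mathcal{G}_1(x) - \mathcal{B}_1(x)\big]\big\},$$

we have the right hand side of (93) being non-positive.
Therefore:

$$\big(U(t) - U_V^*\big)\big[\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)\big] \le 0. \tag{94}$$

This proves (b). Now note that under QLA, if the network
state is $s_i$ then the chosen action $x_U^{(s_i)}$ minimizes:

$$V f(s_i, x^{(s_i)}) + U(t)\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big], \tag{95}$$

over $\mathcal{X}^{(s_i)}$ for the given $U(t)$. Therefore given $U(t)$, the
expected value of the above quantity, i.e.,

$$\sum_{s_i} p_{s_i}\Big[V f(s_i, x^{(s_i)}) + U(t)\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big]\Big],$$

is minimized under QLA. Comparing this fact to the definition
of $q(U)$ in (13), we see that under QLA:

$$q(U(t)) = E\Big\{V f(s_i, x^{(s_i)}) + U(t)\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big] \,\Big|\, U(t)\Big\}. \tag{96}$$

Thus similar as (92), we have:

$$\begin{aligned}
q(U_V^*) &\ge E\Big\{V f(s_i, x^{(s_i)}) + U_V^*\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big] \,\Big|\, U(t)\Big\} \\
&\quad + \big(U(t) - U_V^*\big)E\big\{g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)}) \mid U(t)\big\}.
\end{aligned} \tag{97}$$

Now by (96) we see that $q(U_V^*)$ is the minimum of the
expected value of (95) given $U_V^*$, so we have:

$$q(U_V^*) \le E\Big\{V f(s_i, x^{(s_i)}) + U_V^*\big[g_1(s_i, x^{(s_i)}) - b_1(s_i, x^{(s_i)})\big] \,\Big|\, U(t)\Big\}. \tag{98}$$

Subtract the right hand side of (98) from both sides of (97)
and use (98), we see that Part (a) follows. ■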
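
Inequality (94) can be checked on a small discrete instance of the deterministic problem. The sketch below is an illustration under assumptions that are not from the paper: a single queue, two network states, a binary action per state, and the numeric values of `PROBS`, `ARRIVAL`, `RATE`, `POWER` and `V`. The helper names `cost`, `drift`, `minimizer` and `q` are hypothetical stand-ins for $\mathcal{F}$, $\mathcal{G}_1 - \mathcal{B}_1$, $x_U$ and the dual function. The grid is chosen so that it contains the breakpoints of the piecewise-linear dual function in this instance.

```python
from itertools import product

# Hypothetical toy instance: x = (x^(s_1), x^(s_2)) with x^(s_i) in {0, 1}.
PROBS = [0.5, 0.5]
ARRIVAL = [1.0, 2.0]    # g_1(s_i, x), independent of the action here
RATE = [2.0, 4.0]       # b_1(s_i, x) = RATE[i] * x^(s_i)
POWER = [1.0, 3.0]      # f(s_i, x) = POWER[i] * x^(s_i)
V = 10.0
ACTIONS = list(product((0, 1), repeat=2))


def cost(x):
    """F(x) = V * sum_i p_i f(s_i, x^(s_i))."""
    return V * sum(p * c * xi for p, c, xi in zip(PROBS, POWER, x))


def drift(x):
    """G_1(x) - B_1(x): mean arrival minus mean service."""
    return sum(p * (a - r * xi) for p, a, r, xi in zip(PROBS, ARRIVAL, RATE, x))


def minimizer(U):
    """x_U: an action vector achieving the infimum in the dual function."""
    return min(ACTIONS, key=lambda x: cost(x) + U * drift(x))


def q(U):
    x = minimizer(U)
    return cost(x) + U * drift(x)


grid = [0.5 * k for k in range(41)]     # U = 0, 0.5, ..., 20
U_star = max(grid, key=q)               # dual maximizer on the grid
products = [(U - U_star) * drift(minimizer(U)) for U in grid]
print(U_star, max(products) <= 1e-12)
```

In this instance the drift is positive below the maximizer and negative above it, which is the sign pattern that (94) expresses.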



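Appendix E - Proof of Lemmas 6 and 7

Proof: (Lemma 6) We will prove the case when $0 \le U_1 < U_2 \le U_V^*$;
the other case can be similarly proven. First
we have the following for the dual function:

$$\begin{aligned}
q(U_1) &= \mathcal{F}(x_{U_1}) + U_1\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big] \\
&= \mathcal{F}(x_{U_1}) + U_2\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big] + (U_1 - U_2)\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big].
\end{aligned} \tag{99}$$

From the definition of $q(U_2)$ and $x_{U_2}$, we see that:

$$\begin{aligned}
q(U_2) &= \mathcal{F}(x_{U_2}) + U_2\big[\mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})\big] \\
&\le \mathcal{F}(x_{U_1}) + U_2\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big].
\end{aligned} \tag{100}$$

Plug (100) into (99), we have:

$$\begin{aligned}
q(U_1) &\ge \mathcal{F}(x_{U_2}) + U_2\big[\mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})\big] + (U_1 - U_2)\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big] \\
&= \mathcal{F}(x_{U_2}) + U_1\big[\mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})\big] \\
&\quad + (U_1 - U_2)\Big(\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big] - \big[\mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})\big]\Big).
\end{aligned} \tag{101}$$

Now similar as in (100) we have $q(U_1) \le \mathcal{F}(x_{U_2}) + U_1\big[\mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})\big]$.
Therefore from (101) we obtain:

$$0 \ge (U_1 - U_2)\Big(\big[\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1})\big] - \big[\mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})\big]\Big).$$

Since $U_1 < U_2$, $\mathcal{G}_1(x_{U_1}) - \mathcal{B}_1(x_{U_1}) \ge \mathcal{G}_1(x_{U_2}) - \mathcal{B}_1(x_{U_2})$.
Similar as in the proof of Lemma 4, we see that we also have:

$$E\big\{g_1(s_i, x_{U_1}^{(s_i)}) - b_1(s_i, x_{U_1}^{(s_i)}) \mid U_1\big\} \ge E\big\{g_1(s_i, x_{U_2}^{(s_i)}) - b_1(s_i, x_{U_2}^{(s_i)}) \mid U_2\big\}.$$

From Lemma 4 Part (a) we see that they are both positive. ■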
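
The monotonicity established in the proof of Lemma 6 can be observed on a small example with exact arithmetic. The instance below is purely hypothetical (two states, three power levels, and the tables `SERVICE` and `POWER_COST` are assumptions for illustration), and `expected_drift` plays the role of $\mathcal{G}_1(x_U) - \mathcal{B}_1(x_U)$. Rational numbers are used so that ties in the minimization are resolved without rounding effects.

```python
from fractions import Fraction

# Hypothetical single-queue example with exact (rational) arithmetic.
PROBS = [Fraction(1, 2), Fraction(1, 2)]
ARRIVAL = [1, 2]                    # g_1(s_i, x), independent of x here
SERVICE = [[0, 2, 3], [0, 3, 5]]    # b_1(s_i, x) for power level x = 0, 1, 2
POWER_COST = [0, 1, 3]              # f(s_i, x) for x = 0, 1, 2
V = 4


def action(U, s):
    """Power level minimizing V*f + U*(g_1 - b_1) in state s."""
    def objective(x):
        return V * POWER_COST[x] + U * (ARRIVAL[s] - SERVICE[s][x])
    return min(range(3), key=objective)


def expected_drift(U):
    """Mean arrival minus mean service under the minimizing actions."""
    return sum(
        p * (ARRIVAL[s] - SERVICE[s][action(U, s)])
        for s, p in enumerate(PROBS)
    )


grid = [Fraction(k, 4) for k in range(81)]      # U = 0, 1/4, ..., 20
drifts = [expected_drift(U) for U in grid]
# Lemma 6: the drift should never increase as U grows.
print(all(a >= b for a, b in zip(drifts, drifts[1:])))
```

A larger multiplier makes service more valuable relative to cost, so the minimizing action can only move toward higher service, which is what the non-increasing drift sequence reflects.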
Proof: (Lemma 7) Note that from (15), we have:

$$\big(U_V^* - U(t)\big)^T E\big\{g(s_i, x^{(s_i)}) - b(s_i, x^{(s_i)}) \mid U(t)\big\} \ge q(U_V^*) - q(U(t)).$$

This leads to the following inequality:

$$\sum_{j=1}^{r}\big(U_{Vj}^* - U_j(t)\big)E\big\{\big[g_j(s_i, x^{(s_i)}) - b_j(s_i, x^{(s_i)})\big] \mid U(t)\big\} \ge q(U_V^*) - q(U(t)).$$

Taking $r = 1$, we see that Lemma 7 follows. ■